﻿<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Ted Dziuba</title>
  <link href="http://teddziuba.com/atom.xml" rel="self"/>
  <link href="http://teddziuba.com/"/>
  <updated>2010-06-13T10:50:57-07:00</updated>
  <id>http://teddziuba.com/</id>
  <author>
    <name>Ted Dziuba</name>
    <email>tjdziuba@gmail.com</email>
  </author>


  <entry>
    <title>SEO Is Mostly Quack Science</title>
    <link href="http://teddziuba.com/2010/06/seo-is-mostly-quack-science.html"/>
    <updated>2010-06-12T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2010/06/seo-is-mostly-quack-science</id>
    <content type="html">
      &lt;img src=&quot;/images/set-phasers-to-troll.jpg&quot; class=&quot;mt-image-right&quot;&gt;&lt;p&gt;Most college mathematics or computer science departments have a
      &quot;crank bin&quot;: a box with collected papers that people have sent in for
      review. There are all sorts of gems in there: a two-page proof of the
      Riemann hypothesis, a drawing that demonstrates P = NP, and of course,
      a draft of a patent application for a free energy machine.  Many
      professors just throw this crap out, but some collect it because it
      makes for a good read when you're feeling discouraged by a hard
      problem.&lt;/p&gt;

      &lt;p&gt;Me, whenever I need a pick-me-up, I go read some of the latest
      techniques for SEO.  There are a handful of fundamentals about
      page design and other nitty things like URL structure that are
      generally accepted as &lt;em&gt;good SEO&lt;/em&gt;, and you can derive all of
      this from the principles of not completely failing at web design.
      Non-brain-damaged web design and link building are 100% of SEO.&lt;/p&gt;

      &lt;p&gt;Anyone who tells you different is a quack who is only trying to separate you
      from your money.&lt;/p&gt;

      &lt;p&gt;Quackery in medicine is pretty easy to spot, and quackery in
      computing is pretty similar:&lt;/p&gt;

      &lt;ul&gt;
      &lt;li&gt;&quot;Research&quot; performed by people with slim technical
      backgrounds&lt;/li&gt;
      &lt;li&gt;Suspect experimental controls or no experimental controls&lt;/li&gt;
      &lt;li&gt;Little investigation of alternative explanations for
      phenomena&lt;/li&gt;
      &lt;li&gt;Little to no data reported from findings&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;Let's look at my favorite example: SEOMoz. It's a wealth of
      collected information about SEO, almost completely anecdotal, and of
      course you can get access to more &quot;professional&quot; information for a
      fee. There's probably a place on the site where you can buy Acai
      berries, but I haven't found it yet.&lt;/p&gt;

      &lt;p&gt;SEOMoz recently &lt;a rel=&quot;nofollow&quot;
      href=&quot;http://www.seomoz.org/blog/google-vs-bing-correlation-analysis-of-ranking-elements&quot;&gt;published&lt;/a&gt;
      a correlation analysis of ranking factors for Google and Bing. First
      of all, out of all the factors they measured ranking correlation for,
      nothing was correlated above .35. In most science, correlations this
      low are not even worth publishing. As a warm-up, the author explains a
      graph that shows &lt;em&gt;negative&lt;/em&gt; correlation with rank for URL
      length and .com TLD extension, meaning that longer URLs ranked lower
      in the search results, as did URLs that came from .com
      domains:&lt;/p&gt;

      &lt;img class=&quot;mt-image-center&quot; src=&quot;/images/bing-v-google-negative-corr.gif&quot;&gt;

      &lt;p&gt;This is the explanation, verbatim:&lt;/p&gt;

      &lt;blockquote&gt;The data for URL length shows that longer URLs are
      negatively correlated with ranking well. This isn't particularly
      shocking, and it probably is wise to limit the length of our URLs if we
      want to perform well in the engines. However, the second data point on
      .com TLD extensions shouldn't necessarily suggest that using .com as
      your top-level domain extension will actually negatively affect your
      rankings, but merely that all other things being equal, .com domains
      didn't perform as well in the dataset we observed as other domain
      extensions.&lt;/blockquote&gt;

      &lt;p&gt;That is not how science works. You can't discount data just because
      you feel like it.  Also notice that the most negative correlation metric they
      found was -.18. A correlation of zero suggests that there is no linear
      relationship between the two variables at all. Such a small correlation on
      such a small data set, again, is not even worth publishing.&lt;/p&gt;
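
      &lt;p&gt;For a rough sense of scale, here is a back-of-the-envelope sketch in
      Python. It assumes the coefficients can be read like a plain Pearson
      correlation, which may not be what the study actually computed, and the
      function name is made up for illustration.&lt;/p&gt;

&lt;pre&gt;
```python
def variance_explained(r):
    """Share of variance a linear fit accounts for, given correlation r.

    Assumption: r is read as a Pearson-style coefficient.
    """
    return r * r


# The ceiling and the most negative value quoted in this post.
for r in (0.35, -0.18):
    share = variance_explained(r)
    print("r = {:+.2f} accounts for roughly {:.1%} of the variance".format(r, share))
```
&lt;/pre&gt;

      &lt;p&gt;Under that assumption, even the .35 ceiling accounts for about an
      eighth of the variance, and -.18 for roughly 3%.&lt;/p&gt;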

      &lt;p&gt;There is no hypothesis being tested here. It's just graphs, and
      misleading graphs at that. The sad part is, SEOMoz is as close as the
      SEO industry comes to real science. They may be presenting specious
      results in hopes of looking like they know what they're talking about,
      but at least they are collecting some sort of data.&lt;/p&gt;

      &lt;p&gt;Everything else in the field is either anecdotal hocus-pocus or a
      decree from Matt Cutts. When you hire an SEO consultant, what you are
      really paying for is domain experience in the
      not-failing-at-web-design field. It's fine to pay for this kind of
      service, but beware of anyone who claims to have studied the effects
      of different techniques. They might give you skin failure.&lt;/p&gt;

      &lt;p&gt;Update: It looks like I am not the first one to notice this. Here is a good article with more &lt;a href=&quot;http://irthoughts.wordpress.com/2010/04/23/beware-of-seo-statistical-studies/&quot;&gt;statistical formalism&lt;/a&gt; on SEOMoz quackery.&lt;/p&gt;
    </content>
  </entry>

  <entry>
    <title>The Future of Apple's Curated Computing</title>
    <link href="http://teddziuba.com/2010/05/the-future-of-apples-curated-computing.html"/>
    <updated>2010-05-15T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2010/05/the-future-of-apples-curated-computing</id>
    <content type="html">
      &lt;img src=&quot;/images/step-1-load-the-gun-step-2-kill-yourself-step-3-theres-no-step-3.jpg&quot; class=&quot;mt-image-right&quot;&gt;

      &lt;p&gt;
      &lt;em&gt;Here’s to the crazy ones. The misfits. The rebels. The troublemakers. The round pegs in the square holes. The ones who see things differently. They’re not fond of rules. And they have no respect for the status quo. You can praise them, disagree with them, quote them, disbelieve them, glorify or vilify them. About the only thing you can’t do is ignore them. Because they change things. They invent. They imagine. They heal. They explore. They create. They inspire. They push the human race forward.&lt;/em&gt; -- Apple ad, 1997.
      &lt;/p&gt;

      &lt;p&gt;My first computer was a 33MHz Macintosh Performa 637-CD. It had 8
      megabytes of memory. It was one of the newer Macs that featured a
      CD-ROM drive. We bought Apple's 2400 baud modem as an accessory, and
      signed up for eWorld.&lt;/p&gt;

      &lt;p&gt;The one thing that got me into exploring computers in the first
      place was learning how to hack the game Escape Velocity using ResEdit,
      Apple's pistol-without-a-safety tool that was supposed to be for
      developers but was available to anyone. (I became an expert at
      fresh-installing MacOS 7.5.)&lt;/p&gt;

      &lt;p&gt;At the time, I was about as fanboy as they came. This was
      the era when Apple was widely considered the walking dead, only kept
      standing by the mercy of Adobe continuing to release Photoshop, and
      Microsoft, possibly to keep the antitrust litigators out of their ass,
      publishing Office for the Mac.  Being a Mac user in the late nineties,
      you had this feeling that you were on the side of right &amp;mdash; that
      the competition wasn't about megahertz or gigabytes, but that the
      counterculture was the spark that would give way to the natural order
      of things.  We thought we were on the ground floor of the inevitable.&lt;/p&gt;

      &lt;p&gt;Apple now stands as a monument to the failure of that free-thinking
      counterculture.  We thought that freedom from the tie-wearing,
      meeting-holding, memo-dictating corporate world was going to be the
      catalyst for utopian computing, and that Steve Jobs had the vision for
      how it was all going to work. Maybe he did, maybe he didn't, but that
      utopian endgame is quickly devolving into a dictatorship.&lt;/p&gt;

      &lt;p&gt;It started with Apple's tight control on the iPhone app market, the
      approvals process, and the well-manicured app store. Now, Apple is not
      only dictating what applications may or may not run on the iPhone or
      iPad, but they are also dictating &lt;em&gt;the language in which apps must
      be written&lt;/em&gt;. Their justification for all of this is &quot;for the good
      of the user&quot;, but it might just be the capstone delusion of an aging
      hippie who never got a chance to run for Congress. I predict that
      within five years, Apple will begin &lt;em&gt;telling&lt;/em&gt; development shops
      what kinds of apps they should make. Why? Because it will be &quot;good for
      the user&quot;, and you know, Mr. iPad developer, apps that are good for
      users usually sail right through the approvals process. Apple's
      iPhone/iPad department will be renamed Central Planning, and may God
      help you if you cross them.&lt;/p&gt;

      &lt;p&gt;I could be wrong, though. The backlash has been pretty severe, to
      the point where it may be getting to Steve Jobs. Take, for example, a
      recent e-mail exchange he had with a Gawker reporter, in which Jobs
      took a shot:&lt;/p&gt;

      &lt;blockquote&gt;By the way, what have you done that’s so great? Do you
      create anything, or just criticize others work and belittle their
      motivations?&lt;/blockquote&gt;

      &lt;p&gt;Back when I was writing Uncov, I would see this particular flavor
      of ballache pretty frequently, and it was a very good indicator of
      the person's nerves. The thing is, being a CEO, you need to be able to
      let the critics roll off your back. We talk shit, it's our job, and
      the bigger the shit we talk, the more we get paid. Most executives
      know this, and don't respond to us trolls. It's only when they're
      starting to wear down that they bust out the Teddy Roosevelt
      man-in-the-arena speech. (By the way, Theodore Roosevelt was shot in
      the chest once, and proceeded to deliver a speech with the bullet
      still in him. He left the bullet in his body until his death seven
      years later. Executives: That shit's hard core...you are not TR.)&lt;/p&gt;

      &lt;p&gt;I will still probably buy an iPhone some day because they are very
      cool. However, I will never develop for it, because I'm a crazy one. A
      misfit. And I'm not fond of rules.&lt;/p&gt;

      &lt;p&gt;Now where did I pick up that idea?&lt;/p&gt;
    </content>
  </entry>

  <entry>
    <title>Why Engineers Hop Jobs</title>
    <link href="http://teddziuba.com/2010/05/why-engineers-hop-jobs.html"/>
    <updated>2010-05-01T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2010/05/why-engineers-hop-jobs</id>
    <content type="html">
      &lt;p&gt;&lt;img
      src=&quot;/images/why-is-it-that-militant-atheists-never-say-anything-bad-about-jews.jpg&quot;
      class=&quot;mt-image-right&quot;&gt;What's with all the hate on my generation? It started when somebody
      quit Jason Calacanis's industrial web spam startup, Mahalo, for a higher
      paying position at a competitor. Invariably, Calacanis went apeshit on
      the poor guy in a very public way, and this started a cascade of blogosphere
      butthurt about people in software under thirty: that we're unreliable,
      that we're lazy, that we're entitled.&lt;/p&gt;

      &lt;p&gt;Well, I'm as unreliable, lazy, and entitled as the next guy, but that's not
      why I've hopped jobs in the past. People in my generation have a very
      low tolerance for bullshit, and software engineering, in general, is a
      very high-bullshit career. If you couple that with the standard load
      of bullshit you would get from a non-technical Harvard MBA type boss &amp;mdash;
      like many CEOs that you find trying to get rich in Silicon Valley by
      hiring some engineers to &quot;code up this idea real quick&quot; &amp;mdash; it's no
      wonder that a good engineer will walk off the job after his one-year
      vesting cliff.&lt;/p&gt;

      &lt;p&gt;As an engineer, you are told that you're &quot;lucky to have a job&quot;, because there are &quot;a hundred people lined up
      outside, ready to take it&quot;. (As chance would have it, there are at
      least a thousand lined up to take the job of &lt;em&gt;rich prick who tells
      people what to do&lt;/em&gt;.) This backlash is the product of diseased
      thinking. A CEO who makes an engineer work 80 hours a week is a driven
      entrepreneur, but an engineer asking for a comfy chair is a prima
      donna. So, when we are up to our knees in golf-course, martini-lunch
      bullshit, don't be surprised when we jump ship for a higher
      salary.&lt;/p&gt;

      &lt;p&gt;I recognize the value of business people and
      management. Somebody has to sell the code that I write, which in turn
      puts food on my table. Since I &lt;em&gt;am&lt;/em&gt; an engineer, I like
      iterative optimization. Every time I have left a job, I have
      further refined the requirements that a person must meet before I agree to work for him. After every job, I add one or two requirements to the list, and
      I have found that my happiness at work improves dramatically with
      every step.&lt;/p&gt;

      &lt;p&gt;This is my current list:&lt;/p&gt;

      &lt;ul&gt;
      &lt;li&gt;The organization must need me at least as much as I need it.&lt;/li&gt;
      &lt;li&gt;My direct manager must have a technical background &amp;mdash; enough to understand why programming is hard.&lt;/li&gt;
      &lt;li&gt;My direct manager must have enough experience or raw intelligence such that I can trust him/her to make decisions, even though I may not understand the reasoning.&lt;/li&gt;
      &lt;li&gt;I must have absolute faith in the business plan.&lt;/li&gt;
      &lt;li&gt;I must have absolute faith in &quot;the business side&quot; to execute that plan.&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;So, Jason, when that fellow quit Mahalo, he didn't just leave you
      in the lurch. He added something to his list. Maybe you should find
      out what that is.&lt;/p&gt;
    </content>
  </entry>

  <entry>
    <title>Blog Upgrade</title>
    <link href="http://teddziuba.com/2010/04/blog-upgrade.html"/>
    <updated>2010-04-04T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2010/04/blog-upgrade</id>
    <content type="html">
      &lt;img src=&quot;/images/government-bailouts-prove-that-investment-banking-is-now-graded-on-a-curve.jpg&quot; class=&quot;mt-image-right&quot;&gt;

      &lt;p&gt;So that's it, I'm finally done with Movable Type. I had upgraded my
      4.x install to the newest 5.x, and the process was nothing but a
      colossal fuckup. After the fact, my site was compromised four times by
      my last count &amp;mdash; some bot set up a handful of phishing pages
      here. RSA Security caught it, notified DreamHost, and they shut me down a
      couple of times.&lt;/p&gt;

      &lt;p&gt;Anyhow, rather than figure out the attack vector with Movable Type,
      I decided to scrap it and
      use &lt;a href=&quot;http://github.com/mojombo/jekyll&quot; rel=&quot;nofollow&quot;&gt;Jekyll&lt;/a&gt;. It only took me a day and a
      couple of angry Python scripts to migrate all my shit to Jekyll from
      MT. Comments are still off because I don't care in the slightest
      what people have to say, and certainly not enough to slow my stuff
      down with JavaScript.&lt;/p&gt;

      &lt;p&gt;One of the reasons I used MT in the first place was that it
      generated static HTML pages for all posts, instead of doing
      something silly like querying a database to generate what amounts to
      static content. Because of this, my pages load (first request to
      final render) in about 300
      milliseconds on my home connection. For comparison, techcrunch.com
      can take upwards of 1 minute from first request to final render.&lt;/p&gt;

      &lt;p&gt;In this regard, Jekyll feels right. I can keep everything under
      version control, the templating is only marginally braindead, and
      the publishing step is rsync. After using Jekyll, I feel like every
      other blogging engine out there is telling me, &lt;em&gt;&quot;You'll shoot
      your eye out, kid!&quot;&lt;/em&gt;&lt;/p&gt;

      &lt;p&gt;It's also come to my attention that the re-do with Jekyll has
      caused my posts to show up afresh in all of your Google Reader
      accounts. This was unintentional, but a nice benefit. It's true,
      this is the greatest web site on the internet, and everything you
      need to know, you can find out here.&lt;/p&gt;

    </content>
  </entry>

  <entry>
    <title>I Can't Wait for NoSQL to Die</title>
    <link href="http://teddziuba.com/2010/03/i-cant-wait-for-nosql-to-die.html"/>
    <updated>2010-03-04T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2010/03/i-cant-wait-for-nosql-to-die</id>
    <content type="html">
      &lt;img alt=&quot;trolling-chatroulette-with-pictures-of-suicide-scenes-will-never-stop-being-funny.jpg&quot; src=&quot;/images/trolling-chatroulette-with-pictures-of-suicide-scenes-will-never-stop-being-funny.jpg&quot; width=&quot;335&quot; height=&quot;224&quot; class=&quot;mt-image-right&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt;They don't teach you this in college, but the fundamental theorem of the software industry is the idea that everything needs to be rewritten all the time. &amp;nbsp;As a corollary, web startup engineers believe that there is no problem but scalability, &amp;nbsp;and architecture is its solution. And thus, the NoSQL movement was born.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The idea is that relational databases like MySQL and PostgreSQL have outlived their usefulness, and that document-based or schemaless databases are the wave of the future. Never mind, of course, that MySQL was the perfect solution to everything a few years ago when Ruby on Rails was flashing in the pan. Never mind that &lt;i&gt;real&lt;/i&gt;&amp;nbsp;businesses track all of their data in SQL databases that scale just fine. (For Silicon Valley readers, Walmart is a &lt;i&gt;real business&lt;/i&gt;, Twitter is not.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Invariably, all web projects start off with something like Rails or Django, most likely backed by MySQL. The data relationships are easy to model, and the application works well. &amp;nbsp;If you are lucky enough that people actually &lt;i&gt;use&lt;/i&gt;&amp;nbsp;your application, eventually you will start to see some performance issues. At this point, a developer who values technological purity over gettin' shit done will advocate &quot;rewriting the whole thing in a weekend using Cassandra&quot;. &amp;nbsp;And if he's smart enough, he might just pull it off.
(Of course, said developer has only migrated the &lt;i&gt;app&lt;/i&gt;&amp;nbsp;to use a different data store - all of the ancillary support code was conveniently ignored)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So you've magically changed your backend from MySQL to Cassandra. Stuff will just work now, right? Well, no. Did you know that Cassandra requires a restart when you change the column family definition? Yeah, the MySQL developers actually had to think out how ALTER TABLE works, but according to Cassandra, that's a hard problem that has very little business value. Right.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm not just singling out Cassandra - by replacing MySQL or Postgres with a different, new data store, you have traded a well-enumerated list of limitations and warts for a newer, poorly understood list of limitations and warts, and &lt;i&gt;that&lt;/i&gt;&amp;nbsp;is a huge business risk.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img alt=&quot;one-skill-all-junior-developers-lack-is-knowing-how-to-tell-your-boss-to-fuck-off.jpg&quot; src=&quot;/images/one-skill-all-junior-developers-lack-is-knowing-how-to-tell-your-boss-to-fuck-off.jpg&quot; width=&quot;250&quot; height=&quot;287&quot; class=&quot;mt-image-left&quot; style=&quot;float: left; margin: 0 20px 20px 0;&quot; /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;You Are Not Google&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The sooner your company admits this, the sooner you can get down to some real work. &amp;nbsp;Developing the app for Google-sized scale is a waste of your time, plus, there is no way you will get it right. Absolutely none. 
It's not that you're not smart enough, it's that you do not have the experience to know what problems you will see at scale.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Besides, did you know that Google Adwords is &lt;a href=&quot;http://en.wikipedia.org/wiki/AdWords#Technology&quot;&gt;implemented on top of MySQL&lt;/a&gt;? &amp;nbsp;What, that business critical code that operates at massive scale doesn't use BigTable? No, in fact there is such enormous value in sticking with what works that Google identifies problems with InnoDB at scale and submits patches, instead of saying &quot;MySQL doesn't scale, let's dump it for something else&quot;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;NoSQL will never die, but it will eventually get marginalized, like how Rails was marginalized by NoSQL. &amp;nbsp;In the meantime, DBAs should not be worried, because any company that has the resources to hire a DBA likely has decision makers who understand business reality.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-size: 0.8em; &quot;&gt;Top photo credit &lt;a rel=&quot;nofollow&quot; href=&quot;http://www.paulrussell.info/&quot;&gt;Paul Russell&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Eventlet: Asynchronous I/O for Grownups</title>
    <link href="http://teddziuba.com/2010/02/eventlet-asynchronous-io-for-g.html"/>
    <updated>2010-02-11T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2010/02/eventlet-asynchronous-io-for-g</id>
    <content type="html">
      &lt;img alt=&quot;lose-an-argument-like-a-man-say--well-i-guess-ill-just-go-fuck-myself-then.jpg&quot; src=&quot;/images/lose-an-argument-like-a-man-say--well-i-guess-ill-just-go-fuck-myself-then.jpg&quot; width=&quot;250&quot; height=&quot;188&quot; class=&quot;mt-image-right&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt;Event-driven asynchronous I/O is the newest chatter at the Silicon Valley High Abercrombie table. &amp;nbsp;Threading, the mode of parallelism we all thought we were so smart for understanding, isn't cool anymore. Everybody who is anybody is using asynchronous I/O, and of course, there are different opinions on how it should be done. This being the software world, you can count on those opinions being vehement.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you look at the benchmarks, all of the major async libraries for Python are basically on the same operating plane. There's Twisted, Tornado, gevent, and a handful of others, but the one that really stands out in the group is &lt;a href=&quot;http://eventlet.net/&quot;&gt;Eventlet&lt;/a&gt;. Why is that? Two reasons:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1. &lt;b&gt;You don't need to get balls deep in theory to be productive with Eventlet.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;2. &lt;b&gt;You need to modify very little pre-existing code to adapt a program to be event-driven.&lt;br /&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Eventlet's approach is that &lt;i&gt;asynchronous code should look like synchronous code&lt;/i&gt;. Why? Because it's easy for people to understand synchronous code. &amp;nbsp;Thinking about callbacks and schedulers is unnecessary, after all, we have work to do. 
What's more, not only does asynchronous code with Eventlet &lt;i&gt;look&lt;/i&gt; synchronous, it can also &lt;i&gt;run&lt;/i&gt; synchronously.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Look at this Python snippet:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;

      &lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;
      &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;urllib2&lt;/span&gt;
      &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;lxml.html&lt;/span&gt;

      &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fetch_and_parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;contents&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urllib2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;urlopen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;tree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lxml&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;html&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fromstring&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;contents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;c&quot;&gt;# Do some parsing on the ElementTree to compute value&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
      &lt;/pre&gt;&lt;/div&gt;

      It looks like regular synchronous code, and ostensibly it is. The output of the URL fetch is the input to the HTML parser. However, if you have a ton of URLs to do this to, how would you parallelize it? Threads are an option, but so is Eventlet:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
      &lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;
      &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eventlet&lt;/span&gt;
      &lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eventlet.green&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urllib2&lt;/span&gt;

      &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;urls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;green_pool&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eventlet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GreenPool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
          &lt;span class=&quot;c&quot;&gt;# imap runs fetch_and_parse in green threads and yields results in input order&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;green_pool&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;imap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fetch_and_parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;
      &lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;

      This is interesting because all I've done to make a seemingly synchronous piece of code run asynchronously is to patch the library it needs for I/O and give it a driver function. That driver could just as easily have been a pool of threads all reading from a Queue and importing the standard library's version of urllib2.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now hold on a second. This is a painfully contrived example, but it's such a key point: The asynchronous code looks synchronous. It can even function synchronously. All I did to make it use event-driven I/O is &lt;b&gt;change the driver and patch a library&lt;/b&gt;. Now this is podracing!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That sort of integration has such massive business value that I will easily disregard any pissing-contest performance gains that Twisted or Tornado may offer. I know that when you have code written in the &quot;old&quot; style, and the powers that be hand down the &quot;new&quot; style, there is an itch to rewrite it, but rewriting known-working code is the worst thing you can do for your project.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Eventlet developers have gone further than this, providing a facility to monkey-patch the existing system libraries at invocation time. For example, let's say you have a web app that does some Memcached I/O and some database I/O.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;

      &lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eventlet&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;patcher&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;patcher&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;monkey_patch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;all&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;/pre&gt;&lt;/div&gt;

      Oh look. Your application is now using asynchronous I/O. This call patches Python's socket module and a few others to make it all &quot;just work&quot; with Eventlet's internal coroutine switching mechanism. (Caveat: MySQLdb, which uses C-land sockets, needs a little bit of extra treatment, but it's only a couple of lines)&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This all sounds great in theory, but I have actually made a large I/O bound program work using monkey patching and changing the driver. It is a piece of software that reads jobs from a queue and processes them, putting the result in memcached. For esoteric reasons I will not go into, the job processors could not thread the work, they had to fork. Using this setup, one production box with 8GB of RAM was consistently 7.5GB full. After a less than 5 line code change to the driver, that same production box uses only around 1GB of RAM consistently, and can handle 5 to 10x the throughput of the old system.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now compare this to Twisted or Tornado. Twisted tries so damn hard to be Java that it really offends me personally. Those developers strike me as the alpha-programmer types who see no reason &lt;i&gt;not&lt;/i&gt; to rewrite an existing codebase for a 20% performance gain. &amp;nbsp;Tornado on the other hand is significantly less Jersey Shore douchebaggy, but they still miss the point: we are programmers who need to get stuff done. Inventing your own HTTP client class, when Python's builtin works just fine if not better is the type of hubris that gets hotshot programmers fired in their first month.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's also gevent, which appears to be a fork of Eventlet, but is not as well documented. Partial credit.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's hard to find a performance or scaling related open source library that values my time. 
Eventlet is one of those rare few.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href=&quot;http://eventlet.net/&quot;&gt;http://eventlet.net&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Break My Concentration and I Break Your Kneecaps</title>
    <link href="http://teddziuba.com/2010/01/break-my-concentration-and-i-b.html"/>
    <updated>2010-01-24T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2010/01/break-my-concentration-and-i-b</id>
    <content type="html">
      &lt;img alt=&quot;a-handgun-is-like-an-atm-machine-and-convincing-argument-all-in-one.jpg&quot; src=&quot;/images/a-handgun-is-like-an-atm-machine-and-convincing-argument-all-in-one.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; width=&quot;279&quot; height=&quot;279&quot; /&gt; &lt;div&gt;I own a good set of headphones that fully enclose my ears. I am not an audiophile, I just don't like to hear other people talk at me. &amp;nbsp;When I am staring at my Emacs windows with headphones on, it generally isn't a physical cue that I am looking for conversation. In fact, when I am that deep into thinking out a problem and I get interrupted, I think about the anti-workplace-violence clause in the employee handbook, and how a poorly lit parking lot probably doesn't qualify as &quot;company property&quot;.&lt;br /&gt;&lt;br /&gt;Interrupting a thinking programmer is a sucker punch to productivity's kidney. Of course it's still important to keep open communication channels, especially in a small team. I don't mind answering questions and helping out, so long as it's not an immediate context switch for me, i.e. I'll help you if I don't have to speak.&lt;br /&gt;&lt;br /&gt;Instant messaging is a decent first attempt, but it's only person-to-person communication. (And no, group-IM &lt;i&gt;never&lt;/i&gt; fucking works right) Programming teams need group chat.&amp;nbsp; White-label Twitter clones like Yammer are okay, but I feel icky using a product that is hailed as a technological advance for supporting the ability to identify topics by prefixing a word with a pound sign. That, and I want to keep an eye on the conversation as I work, and my attention isn't on my IM client or browser when I'm coding. It's on Emacs. &lt;br /&gt;&lt;br /&gt;The answer, of course is IRC.&lt;br /&gt;&lt;br /&gt;My team recently grew, and four of us need to communicate constantly. I set up an IRC server and brought people in. 
One non-programmer who needed to be in the loop had never used IRC, but caught on quickly. Productivity is up, as is communication. The developer chat channel is right in front of me as I work, as a window in Emacs:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;at-the-crunchies-i-got-drunk-and-started-heckling-people-who-used-to-be-important.png&quot; src=&quot;/images/at-the-crunchies-i-got-drunk-and-started-heckling-people-who-used-to-be-important.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; width=&quot;576&quot; height=&quot;317&quot; /&gt;Think of developer communication like I/O. There's blocking and nonblocking. When somebody talks to me as I work, my programming train of thought needs to block. With inline chat like you see above, I can answer questions when I have spare cycles. Since the conversation is integrated into my development environment, I don't need to look around at other applications, and there's no popup notification bouncing around like a Jack Russell terrier who got into my Adderall supply. Also since it's Emacs, it's not vim. If you use vim, /quit #life.&lt;br /&gt;&lt;br /&gt;Collaboration technology doesn't need to be re-invented every six years. The stuff we had in the eighties works just fine.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Options for Parallel Compression</title>
    <link href="http://teddziuba.com/2010/01/options-for-parallel-compressi.html"/>
    <updated>2010-01-15T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2010/01/options-for-parallel-compressi</id>
    <content type="html">
      &lt;img alt=&quot;when-a-couple-gets-a-dog-its-like-saying-we-want-a-baby-but-dont-want-to-go-to-jail-if-it-dies-by-accident.jpg&quot; src=&quot;/images/when-a-couple-gets-a-dog-its-like-saying-we-want-a-baby-but-dont-want-to-go-to-jail-if-it-dies-by-accident.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;255&quot; width=&quot;288&quot; /&gt;At Milo, I pretty frequently need to pull data down from production to my workstation to test some new code. That's what happens when you raise a Series A round - you can't live-edit production data anymore. I think it's in the term sheet somewhere.&lt;br /&gt;&lt;br /&gt;Anyhow, I was pulling down a 14GB MySQL database dump today. Trying to compress it through plain Jane gzip was pretty slow, so I looked for some parallel options. The server I was pulling from has 16 cores, so I figured I could make use of them.&amp;nbsp; Anyhow, here's what I found:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://compression.ca/pbzip2/&quot;&gt;pbzip2 - Parallel BZIP2&lt;/a&gt;: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up using multiple CPUs.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.zlib.net/pigz/&quot;&gt;pigz - Parallel GZIP&lt;/a&gt;: Parallel implementation of GZIP written by Mark Adler (guy who co-authored zlib and gzip, so you can be reasonably confident he has his shit together).&lt;/li&gt;&lt;/ul&gt;On the 14GB database dump, both are faster than vanilla GZIP. Because Hacker News and Reddit both love this shit, here are the timing stats:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Plain gzip, default compression level: 11 minutes, 58 seconds. Resultant file is 2.3GB.&lt;/li&gt;&lt;li&gt;pbzip2, default compression level: 8 minutes, 48 seconds. Resultant file is 1.7GB.&lt;/li&gt;&lt;li&gt;pigz, default compression level: 1 minute, 33 seconds. 
Resultant file is 2.3GB.&lt;/li&gt;&lt;/ul&gt;Again this was on a 14GB database dump file, on a 16-core machine, with Intel solid state disks.&lt;br /&gt;&lt;br /&gt;If any readers know of other parallel compression schemes I can try, e-mail me and let me know. I will post stats here.&lt;br /&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt;
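&lt;div&gt;For the curious, here is a minimal sketch of the basic trick behind parallel compressors: chop the input into chunks, compress the chunks on separate workers, and glue the results back together. This is not pigz's code (pigz is written in C and manages its output stream differently). The function name parallel_gzip, the 1MB chunk size, and the worker count are all made up for illustration. It leans on two things: concatenated gzip members form a valid gzip stream, and CPython's zlib releases the GIL while it compresses.&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;
```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data, chunk_size=1024 * 1024, workers=4):
    """Compress bytes as concatenated gzip members, one member per chunk.

    Hypothetical helper for illustration only; not how pigz is implemented.
    """
    # Slice the input into fixed-size chunks (the last one may be shorter).
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # CPython's zlib releases the GIL while compressing, so plain threads
    # can keep several cores busy here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        members = list(pool.map(gzip.compress, chunks))
    # Concatenated gzip members form a valid multi-member gzip stream.
    return b"".join(members)


if __name__ == "__main__":
    # Fake "database dump" so the sketch runs without any real data.
    sample = b"INSERT INTO t VALUES (1, 'x');\n" * 200000
    packed = parallel_gzip(sample)
    # Round trip: the standard decompressor reads every member in order.
    assert gzip.decompress(packed) == sample
    print(len(sample), "bytes in,", len(packed), "bytes out")
```
&lt;/pre&gt;&lt;/div&gt;
&lt;div&gt;Plain gunzip or Python's gzip.decompress reads the result like any other gzip file. Compressing chunks independently usually costs a little compression ratio, because each chunk starts with an empty dictionary.&lt;/div&gt;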
    </content>
  </entry>

  <entry>
    <title>I Love the GPL (Except When it Applies to Me)</title>
    <link href="http://teddziuba.com/2010/01/i-love-the-gpl-except-when-it.html"/>
    <updated>2010-01-02T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2010/01/i-love-the-gpl-except-when-it</id>
    <content type="html">
      &lt;img alt=&quot;if-red-wine-and-hybrid-cars-were-made-from-animals-there-would-be-no-more-vegans.jpg&quot; src=&quot;/images/if-red-wine-and-hybrid-cars-were-made-from-animals-there-would-be-no-more-vegans.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;360&quot; width=&quot;263&quot; /&gt; &lt;div&gt;Boy do I love free software. It is usually pretty high quality, I don't have to pay for it, and I feel completely justified in criticizing the maintainers on public mailing lists for not supporting the exact features I need.&amp;nbsp; Of course I'm not going to send patches back, because it's just way easier to bitch and moan. &lt;br /&gt;&lt;br /&gt;Also, since my software product is a web service, I have exactly zero obligation to contribute anything back to the community, ever. Sure, I may use some GPLed software, but shit, actually following the spirit of the copyleft? Don't they know this is a business, not a charity? Fuck that noise.&lt;br /&gt;&lt;br /&gt;I came up in the salad days of Slashdot, when the cast of villains and henchmen included Microsoft, SCO, and anyone else who wanted to turn a dime from software. We believed in the GPL, that a viral copyleft clause was good for humanity. That is, until we left academia and had to pay the rent.&lt;br /&gt;&lt;br /&gt;Since the world appears to be moving toward software as a service (against my sage advice, mind you), it is blisteringly easy to be a champion of the ideals behind open source and free software, but still pussyfoot around when it comes to execution.&amp;nbsp; What I'm talking about is the loophole in the GPL that exempts application service providers from having to release their derivative works under the same license as the libraries.&lt;br /&gt;&lt;br /&gt;The pedantic reader who is going to talk shit will point out the difference between &lt;i&gt;open source&lt;/i&gt; and &lt;i&gt;free&lt;/i&gt; software. 
So, before you write a blog post that nobody's going to read, allow me to demonstrate.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Open Source&lt;/b&gt;: I want to let others use my code in whatever manner they please, and not be bound by an anti-commercial license.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Free Software&lt;/b&gt;: I found a loophole in my student loan documentation that lets me defer payments for decades, so long as I stay in the Ph.D. program!&lt;br /&gt;&lt;br /&gt;If anything good comes out of Web 2.0, it's the malignant tumor on the GPL's kidney, still wrongly diagnosed as a urinary tract infection.&lt;br /&gt;&lt;br /&gt;Back in the Slashdot days, we all thought that the fate of free software would be decided by a landmark court decision, that if the ideals of the GPL were to die, they would wind up meeting a ceremonious end like the cabinet members of a government overthrown in a military coup. But no - the free software ideal will die by the hands of a thousand poseurs, all who want the notoriety of contributing to open source, but none who are convicted enough to release any of their business's core code under a free license.&lt;br /&gt;&lt;br /&gt;The copyleft will share the same fate as the hippie movement, now only a shell of its former self supported by college age kids who hang out in the Haight-Ashbury and smoke pot all day, and at night, drive their Lexuses over the Golden Gate, back to Marin County. But you will take off that damn Che Guevara shirt before you come back into my house, young man.&lt;br /&gt;&lt;br /&gt;Look at all of the open source software in modern use. The vast majority of it is licensed under terms without a copyleft clause. The BSD license, Apache license, MIT license, and a handful of others are the most prevalent. In some places, the GPL still kicks around, but since we are application service providers, we are all free to ignore it. 
&lt;br /&gt;&lt;br /&gt;The Affero General Public License, a version of the GPL that closes the service-provider loophole, is almost nowhere to be found. The only new-hotness software I know of that is licensed under Affero is MongoDB, and even they have a chickenshit implementation - they have structured the code such that the 99% case of a web application using Mongo is effectively bound by the Apache license.&lt;br /&gt;&lt;br /&gt;Affero-licensing your project is a fatal defect if you want it to be used. Since the current flow of the software industry has effectively neutered the GPL, the only serious chance the copyleft has is the Affero license, and that sure-as-shit ain't gonna happen.&lt;br /&gt;&lt;br /&gt;The toll on the Golden Gate Bridge is now six dollars.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>How I Spot Valuable Engineers</title>
    <link href="http://teddziuba.com/2009/12/how-i-spot-valuable-engineers.html"/>
    <updated>2009-12-14T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/12/how-i-spot-valuable-engineers</id>
    <content type="html">
      &lt;img alt=&quot;hire-women-at-a-startup-because-an-office-full-of-young-men-will-live-in-their-own-filth-until-an-investor-shows-up-for-a-tour.jpg&quot; src=&quot;/images/hire-women-at-a-startup-because-an-office-full-of-young-men-will-live-in-their-own-filth-until-an-investor-shows-up-for-a-tour.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;379&quot; width=&quot;325&quot; /&gt; &lt;div&gt;Goofy stuff happens when your company announces a funding round. We've gotten walk-in solicitors who try to sell us networking equipment, pitches for I-can't-quite-see-the-scam-but-I'm-sure-it's-there-somewhere stock exchange programs, and phone calls from slick Oracle salesmen who have their get-past-the-secretary sneak perfected so well that they could probably make better livings as industrial spies.&lt;br /&gt;&lt;br /&gt;But most frequently, there are résumés that land in my inbox. Yes, Milo is hiring, and a lot of people contact me directly instead of the &quot;jobs&quot; address, which I can sympathize with because I've always had this feeling that &quot;jobs@&quot; e-mail addresses are black holes where career dreams get sent to die.&lt;br /&gt;&lt;br /&gt;Our general workflow for hiring engineers is to send the person our &quot;engineering challenge&quot; programming question and see how they do on it. If that looks good, they come in for interviews. I don't like doing interviews because I've always got enough stuff to do, but sometimes it's a good break. Necessary evil, I guess. Like Katy Perry. Have you &lt;i&gt;heard&lt;/i&gt; a live performance? Ph33r. 
&lt;br /&gt;&lt;br /&gt;Anyhow, when I interview a candidate, I'm trying to determine how &lt;b&gt;valuable&lt;/b&gt; the candidate is, not just how smart he or she is.&amp;nbsp; Because I love English semantics:&lt;br /&gt;&lt;br /&gt;A &lt;b&gt;smart&lt;/b&gt; candidate will do well on the engineering challenge problem.&lt;br /&gt;A &lt;b&gt;productive&lt;/b&gt; candidate will be able to explain past projects in detail.&lt;br /&gt;A &lt;b&gt;valuable&lt;/b&gt; candidate is smart and productive, but also has useful knowledge gained from experience.&lt;br /&gt;&lt;br /&gt;To tell if a candidate is valuable, you need to piss them off. (By the way, does it make you feel icky that &lt;i&gt;they&lt;/i&gt; can be used with a singular antecedent? This derelict language is put together with duct tape and baling wire, I swear.)&amp;nbsp; A valuable candidate will likely have been personally offended by some sequence of bullshit thrown from a programming language, tool, library, or problem in past work. This is the kind of bullshit-train I'm talking about.&lt;br /&gt;&lt;br /&gt;Need to parse XML with Python → SGMLlib feels like a kid's toy → Implement it with BeautifulSoup → Fuck me, Soup is too slow → Re-implement with LXML → LXML works great for months → LXML segfaults the Python interpreter when used in a threaded environment under heavy load → &lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;hey-look-another-4chan-meme-that-reddit-has-bludgeoned-with-the-bat-of-unoriginality.JPG&quot; src=&quot;/images/hey-look-another-4chan-meme-that-reddit-has-bludgeoned-with-the-bat-of-unoriginality.JPG&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;116&quot; width=&quot;154&quot; /&gt;&lt;br /&gt;&lt;/div&gt;
      Any developer who has been around enough to accumulate valuable experience will have his personal collection of stories that have made him rage. I have been burned by bugs in programming language implementations, bugs I call &quot;coding slurs&quot;. I have gotten the shaft more times than I can count from pathological character set issues that make me want to run for Congress on the platform of requiring licenses before people are allowed to use computers.&amp;nbsp; If you really want to find the value in a job candidate - find out what pisses him off.&lt;br /&gt;&lt;br /&gt;The easiest way I have found of doing this is to ask a candidate &lt;i&gt;&quot;what don't you like about your favorite programming language?&quot;&lt;/i&gt; You can grade their experience with the response. For example:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What don't you like about Java?&lt;/i&gt;&lt;br /&gt;&lt;b&gt;Out-of-college answer:&lt;/b&gt; &quot;Java is too verbose&quot;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Battle-hardened developer answer:&lt;/b&gt; &quot;Object storage is aligned on a 64-bit boundary, at least in Sun's JVM, so if you need to allocate a lot of small storage, you really need to know JVM internals so you don't run out of memory.&quot;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What don't you like about Python?&lt;/i&gt;&lt;br /&gt;&lt;b&gt;Answer from a candidate who will write frameworks for solving problems instead of getting shit done:&lt;/b&gt; &quot;Dynamic typing means you need to rely more on your tests and less on the interpreter to make sure your code is correct&quot;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Answer from Gunnery Sergeant Hartman, your senior drill instructor:&lt;/b&gt; &quot;Independent object cycles where one of the objects has a __del__ method don't get garbage collected.&quot;&lt;br /&gt;&lt;br /&gt;People think I hate programming. Nope. What I hate is fording endless rivers of horseshit that are in the way of seemingly simple tasks. 
And I hate it even more when I have to explain to a non-programmer what I am doing, &quot;building LXML against a different version of libiconv because I think it might be the source of a crash&quot;. &lt;br /&gt;&lt;br /&gt;&quot;But all I asked you to do was parse some documents.&quot;&lt;br /&gt;&lt;br /&gt;Good times. &lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Introducing Milo</title>
    <link href="http://teddziuba.com/2009/11/introducing-milo.html"/>
    <updated>2009-11-24T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/11/introducing-milo</id>
    <content type="html">
      &lt;img alt=&quot;oh-good-lord-i-hope-the-servers-stay-up-today.jpg&quot; src=&quot;/images/oh-good-lord-i-hope-the-servers-stay-up-today.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;301&quot; width=&quot;394&quot; /&gt;I had mentioned a couple of months ago that I had my head down into a new project. It's been an open secret that the project is &lt;a href=&quot;http://milo.com/&quot;&gt;Milo.com&lt;/a&gt;, an online local comparison shopping engine.&amp;nbsp; We index the inventory of stores nationwide and show you real-time, what is available around you. From an engineering perspective it's a cool problem because there is a lot of data to store and manage, as well as a lot of integration work to deal with the particular temperament of various retailers' inventory systems. Of course if it were easy, someone would have done it already.&lt;br /&gt;&lt;br /&gt;From a business perspective, I like it a lot. The online comparison shopping world is very crowded, and we didn't want to be just another me-too AdWords arbitrage/affiliate marketing site. When I got into this business, I thought that online shopping was like the Stairway to Heaven of Internet business, but with the local inventory lookup, I think we really have distinguished ourselves from the others out there.&lt;br /&gt;&lt;br /&gt;Today I'm happy to announce that we've closed a $4 million Series A investment round, led by True Ventures, with other investors such as Ron Conway, Aaron Patzer, and Jeff Clavier also participating.&amp;nbsp; As a side note, I was really impressed by the True team, and am happy to be working with them. There were ups and downs to the Series A process, and I have to say that pitching the True partners was a definite up.&lt;br /&gt;&lt;br /&gt;Oh, right. We also have a mascot. His name is Milo, of course. 
Here he is attacking me at my desk:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;what-is-it-about-dogs-and-face-licking.jpg&quot; src=&quot;/images/what-is-it-about-dogs-and-face-licking.jpg&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;320&quot; width=&quot;240&quot; /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Hey Lets Bitch About SEO Again</title>
    <link href="http://teddziuba.com/2009/10/hey-lets-bitch-about-seo-again.html"/>
    <updated>2009-10-13T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/10/hey-lets-bitch-about-seo-again</id>
    <content type="html">
      &lt;img alt=&quot;cant-we-go-back-to-debating-if-google-is-evil-for-doing-business-in-china.jpg&quot; src=&quot;/images/cant-we-go-back-to-debating-if-google-is-evil-for-doing-business-in-china.jpg&quot; width=&quot;281&quot; height=&quot;375&quot; class=&quot;mt-image-right&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt; &lt;div&gt;Hey I have an awesome idea. Let's take a field of business that many people work in to make a legitimate living, and tear it down for being immoral and accuse it of fraud. &amp;nbsp;And when it comes to solving the actual problem that this business works on, apply a nice helping of sunshine-up-your-ass, and everything's just fine.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Better yet, let's do this every six to eight months, because collectively we have the attention span of a fruit fly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Well, it's been a solid eight months, and somebody kicked the hornet's nest. Is SEO good or evil? &amp;nbsp;It's good. It's great. I &amp;lt;3 SEO.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When you hire a legitimate white hat SEO, you are paying for domain knowledge. Is it better to use dashes or underscores to separate keywords in a URL? I know the answer, but I've spent some time researching SEO. If I were, say, an online publisher, it would be worth money to hire somebody who knows the answer to this question and a pop-quiz full of other questions that isn't in your average web developer's job description.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Every hit on SEO eventually ends with the same solution. &quot;Just write good content or make a good web app, and the traffic will come.&quot; &amp;nbsp;Oh really, it's just that simple, eh? How many unpublished novelists are there out there? How many film students whose reels go unwatched? 
Google is the greatest media distribution channel that there has ever been, and you expect people not to look for every advantage they can get?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the failure with the &quot;make a good app, people will come&quot; argument. Let's say you are making an application whose target market is one person in ten. That's a&amp;nbsp;respectably sized market. &amp;nbsp;You tell your friends, your family, people you know through the internet. You write on your personal blog about it. &amp;nbsp;Let's say you reach 1,000 people, generously. &amp;nbsp;If your hit rate within that market is 50%, that's 50 people you've got who haven't immediately dumped your app. Do they care enough about it to do your marketing for you? &amp;nbsp;With that small of a user base, you don't have statistically significant feedback to improve the site, you've got to gun it on intuition, which is frequently wrong.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So there you are, with your 50 users, and since you don't have to spend any time or money on distributing your app (remember these 50 people will do it for you), then you can continue to develop the app, making it &quot;better&quot;, as you see it, in a vacuum. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And let's just count on those 50 people bringing in 10 million of their closest friends in the next month or so.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hell no. This is the internet, son. Kill or be killed. &amp;nbsp;If you can spend some money on a good SEO who will bring a steady flow of traffic to your site, then you have a way better chance than with that initial set of 50. &amp;nbsp;With search engine traffic, even if you're only getting a handful of traffic every day, it's a different handful. 
&amp;nbsp;If you have built something of value, some percentage of users will recognize this, and maybe tell a friend, maybe they'll come back to your site, and maybe they'll link to you, but you have a continuous stream of people to try it out on.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Obviously there are shysters in SEO. &amp;nbsp;Going to an SEO who guarantees that you'll rank in the top 10 for mesothelioma is like taking your car to the dealership to get fixed. Of &lt;i&gt;course&lt;/i&gt;&amp;nbsp;you're going to get scammed. Buyer beware, and all that.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, keep debating on whether or not SEO is evil. The rest of us have to find ways to handle our traffic growth.&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>I Don't Code in my Free Time</title>
    <link href="http://teddziuba.com/2009/10/i-dont-code-in-my-free-time.html"/>
    <updated>2009-10-10T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/10/i-dont-code-in-my-free-time</id>
    <content type="html">
      &lt;img alt=&quot;obama-winning-the-nobel-proves-that-white-guilt-is-one-of-the-most-awesome-powers-on-earth.jpg&quot; src=&quot;/images/obama-winning-the-nobel-proves-that-white-guilt-is-one-of-the-most-awesome-powers-on-earth.jpg&quot; width=&quot;210&quot; height=&quot;840&quot; class=&quot;mt-image-right&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt; &lt;div&gt;Why would you ever hire a programmer who doesn't program in his free time? &amp;nbsp;I mean, a person who doesn't compile recreationally is probably useless on the job. &amp;nbsp;You might as well hire somebody ... &lt;i&gt;old&lt;/i&gt;. And who wants a bunch of people around the office who whine about things like &lt;i&gt;healthcare benefits&lt;/i&gt;? Just don't get sick, duh.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I love it when twenty-something engineers take such a hard-line position on something they have so little experience with, like hiring. &amp;nbsp;Saying that you wouldn't hire somebody for a programming job because they don't program in their spare time is blissfully naive. Yeah, I remember the days when my greatest responsibility to another human being was making rent on the first of the month.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If I am going to hire somebody for a programming job, I don't really give a shit &lt;i&gt;what&lt;/i&gt; they do in their spare time, so long as that person is very good at the task at hand. &amp;nbsp;I don't ask questions about what a person does in their free time in job interviews because I don't care, and because that can sometimes open the door to an illegal conversation. (What's that? There are laws about what you can ask somebody in a job interview? 
Who thought that up, Republicans?)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;background-color: rgb(255, 255, 255); &quot;&gt;Me, I can count on one hand the number of times I've programmed outside of work or a class. &amp;nbsp;There was only once when I actually enjoyed it, though. I was in college, and shared a common wall with a girl from Spain who was painfully unaware that her computer had a volume control knob. She would stay up late on AOL instant messenger, and I couldn't sleep. &amp;nbsp;So, I rigged up a Python script to play AOL instant messenger sounds randomly every 5 to 10 seconds, turned up my speakers, pointed them at the wall, and went on vacation for a week. &amp;nbsp;And thus, the asshole you all know and love is born.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I don't enjoy programming so much as I enjoy the satisfaction I get from cracking hard problems.&lt;span class=&quot;Apple-style-span&quot; style=&quot;background-color: rgb(255, 255, 255); &quot;&gt;&amp;nbsp;In that case, computer code is a means to an end, but so is my Craftsman socket set. &amp;nbsp;I like to spend free time wrenching on a car or a bike, but I don't set out on Saturday morning and say &quot;I'm going to learn how to use a torque wrench today, because those things are the future of tools&quot;. &amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I would not want to work for a company that wouldn't hire me because I don't code in my spare time. Professional development? Working at a startup, I get a heaping helping of that on the job. &amp;nbsp;Keeping up with new technology? Yeah, I read reddit, and again, startup. &amp;nbsp;You know what's more awesome than spending my Saturday afternoon learning Haskell by hacking away at a few Project Euler problems? 
Fuck, ANYTHING.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Really, why should I bother spending time with my family and taking an active role in my kids' development when there's a dead-beaten math puzzle that doesn't have a good answer in Clojure? &amp;nbsp;&quot;I won't hire someone who doesn't code in their free time&quot; is Siliconvallese for &quot;I don't want to hire any grownups because they remind me of my parents&quot;.&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Twisted vs. Tornado: You're Both Idiots</title>
    <link href="http://teddziuba.com/2009/09/twisted-vs-tornado-youre-both.html"/>
    <updated>2009-09-18T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/09/twisted-vs-tornado-youre-both</id>
    <content type="html">
      &lt;img alt=&quot;plucking-you-unibrow-is-the-most-undignified-type-of-grooming.jpg&quot; src=&quot;/images/plucking-you-unibrow-is-the-most-undignified-type-of-grooming.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;268&quot; width=&quot;390&quot; /&gt; &lt;div&gt;First, a message to bloggers. If you've got the bright idea to try some new kind of benchmark that pits Twisted against Tornado, take pause. Turn off your computer, step into a public area, and reconsider your life's goals. The internet does not need another pointless network performance graph.&lt;br /&gt;&lt;br /&gt;With that out of the way, it's become clear that the Pissing Contest of the Day, Twisted.web vs. Friendfeed's Tornado web framework, reveals that neither side of the argument is particularly right, but both sides are particularly stupid.&lt;br /&gt;&lt;br /&gt;First, Twisted. Now, my company uses Twisted for a small piece of functionality because it was the easiest way that we found to send traffic over different network interfaces on a Linux machine. We never have any problems with it. The only reason I ever need to touch it is to see how something works.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;However, Twisted is probably the douchiest programming library out there. Every time I open up that code, I feel like I've wandered into a late-night bar on the Jersey Shore where everybody's drinking Jager-bombs, and nobody is wearing a shirt.&amp;nbsp; Twisted is a cool network library, but not cool enough to be named &quot;Twisted&quot;.&amp;nbsp; It's the Python programmer's version of Ed Hardy clothing and a baseball cap with the tag still hanging off the side.&amp;nbsp; When I'm digging around in this code and my co-workers ask me what's up, the only appropriate response is &quot;NOT NOW CHIEF. 
I'M STARTIN' THE FUCKIN' REACTOR.&quot;&lt;br /&gt;&lt;br /&gt;Now you can see why there's so much buttsore over Tornado.&lt;br /&gt;&lt;br /&gt;Even though I &lt;a href=&quot;http://teddziuba.com/2009/06/startups-keep-it-in-your-pants.html&quot;&gt;advised&lt;/a&gt; &lt;a href=&quot;http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y.html&quot;&gt;against&lt;/a&gt; things like Tornado, Friendfeed still built it. From the graphs I've seen, Tornado is just marginally faster than Twisted at serving concurrent requests. Marginally. Evidently Friendfeed figured that tiny margin was enough justification to waste their time writing something that's been re-written by every developer that gets bored on the job.&amp;nbsp; A Python web framework? My mercy how original. I think that's one of the ending exercises of &quot;Learn Python in 24 Hours&quot;.&lt;br /&gt;&lt;br /&gt;Friendfeed spent a lot of time trying to optimize the queries per second graph, but maybe they should have spent more time optimizing this graph instead:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;intuit-didnt-buy-mint-they-bought-a-license-to-stagnate.png&quot; src=&quot;/images/intuit-didnt-buy-mint-they-bought-a-license-to-stagnate.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;186&quot; width=&quot;561&quot; /&gt;Anyway, when it comes to Twisted vs. Tornado for a Python web framework, I use Django. Why? Because it works, and my time is valuable.&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>30 Helens Agree: You Can't Win Without Failing</title>
    <link href="http://teddziuba.com/2009/09/i-read-fred-wilsons-blog.html"/>
    <updated>2009-09-09T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/09/i-read-fred-wilsons-blog</id>
    <content type="html">
      &lt;img alt=&quot;an-infant-is-a-function-whose-inputs-are-sight-sound-smell-touch-and-taste-and-whose-outputs-are-bodily-fluids.jpg&quot; src=&quot;/images/an-infant-is-a-function-whose-inputs-are-sight-sound-smell-touch-and-taste-and-whose-outputs-are-bodily-fluids.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;270&quot; width=&quot;360&quot; /&gt;I read &lt;a href=&quot;http://www.avc.com/a_vc/2009/09/failure.html&quot;&gt;Fred Wilson's blog post on failure&lt;/a&gt; today, and after I was finished being impressed by his three letter domain name, it really made me think about what I learned from my last failed startup.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;There's the usual Reddit material: don't write your own database, concentrate on the UI, put your users first, other such horse-beaten realities that green engineers understand after being in the field for a few years.&amp;nbsp; A true failure is one that changes your life's philosophy, not one that changes your unit testing strategy.&lt;br /&gt;&lt;br /&gt;What I really learned from the fall of Pressflip is that &lt;b&gt;arrogance is more dangerous than incompetence&lt;/b&gt;.&amp;nbsp; I believed that raw engineering prowess could make up for the complete lack of business experience, a product that really only appealed to the people who build the technology behind it, and an addressable market that could easily be mistaken for roundoff error. Couple that with the youthfully cute thought that Silicon Valley is a meritocracy, and it was only a matter of time. We had built some neat technology behind the scenes, and I was very proud of a few key parts of the system, but in the end, the users just did not come.&lt;br /&gt;&lt;br /&gt;The trouble with this lesson is that it can only be learned the hard way. 
Arrogant people don't listen to criticism, they just run themselves into the wall.&amp;nbsp; Incompetent people can usually be led in the right direction, even though they may execute their way into the dirt.&amp;nbsp; Arrogance doesn't listen to reason, it only listens to itself.&lt;br /&gt;&lt;br /&gt;For example, an arrogant motorcyclist will ride on the highway at twice the speed of traffic, and no matter how many times he gets pulled over, he'll keep doing it until he crashes.&amp;nbsp; An incompetent motorcyclist will drop his bike in a U-turn in front of his house, cracking a mirror.&lt;br /&gt;&lt;br /&gt;This failure made me saltier. I now understand why old men have no patience for the modern world.&amp;nbsp; However, it did not let me keep thinking that superior code is the solution to any conceivable problem. I've hunkered down a bit, concentrating on a new project that I really believe will be a winner, and started learning the business realities of a cruel Valley.&lt;br /&gt;&lt;br /&gt;So now, if an investor asks me what I learned from past failures, I won't put him to sleep talking about schema-less versus SQL databases. Instead, I've got a good answer.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>A Happy Life Without the Whining</title>
    <link href="http://teddziuba.com/2009/08/a-happy-life-without-the-whini.html"/>
    <updated>2009-08-28T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/08/a-happy-life-without-the-whini</id>
    <content type="html">
      &lt;img alt=&quot;is-there-a-word-for-the-feeling-you-get-when-you-realize-three-quarters-of-your-twitter-followers-are-spammers.gif&quot; src=&quot;/images/is-there-a-word-for-the-feeling-you-get-when-you-realize-three-quarters-of-your-twitter-followers-are-spammers.gif&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;311&quot; width=&quot;208&quot; /&gt; &lt;div&gt;There's no better way to waste your time than to talk about politics.&amp;nbsp; For as good as educated people are at acting intellectual, we love to bitch and moan about one side versus the other. &lt;br /&gt;&lt;br /&gt;Politics are potato chips for the enlightened mind.&lt;br /&gt;&lt;br /&gt;I grew up in Connecticut. New Englanders are generally pretty educated people. We keep to ourselves. We vote. We donate money to causes, and for the most part, &lt;i&gt;we shut the fuck up&lt;/i&gt;.&amp;nbsp; Personally, I'm registered to one of the two major parties in the US. (If you tried to guess, you'd probably guess wrong.)&amp;nbsp; I don't get into political arguments because I've got better shit to do. I don't blog about politics because I know that nobody cares what I think. You know what? It's a good life.&lt;br /&gt;&lt;br /&gt;Lately, I've been hearing a lot about this Glenn Beck fellow. I don't know who he is or what he said to get everyone so sore-assed, but I sure as shit don't care. I don't watch CNN or Fox News. I don't have cable TV. I get all my news from my local news channel over the air. No talking heads, no shouting matches, no six-second-attention-span scrolling tickers on the bottom of the screen. In 30 morning minutes, I get a brief summary of what the president said at such and such a meeting the other day, a look at the traffic and weather for the day, and some feel-good community segment. 
&lt;br /&gt;&lt;br /&gt;The last thing I need on a 40 mile motorcycle ride to work is a head full of piss, thanks to Bill O'Reilly or Keith Olbermann.&lt;br /&gt;&lt;br /&gt;But I can tell you that from the inside, generating butthurt is big business. Every time I've knocked an article out of the park for The Register, there's been a decent troll element to it. Not all trolls succeed, but the ones that hit a nerve really bring in the page views and comments. That's just the IT world. If I could get a job trolling politics, I'd be damn sure to demand a page view bonus. I can't knock the hustle.&lt;br /&gt;&lt;br /&gt;The news networks aren't stupid. They know that viewership increases when people are pissed off.&amp;nbsp; Walter Cronkite delivered facts, but was a crusty old book report of a man for it.&amp;nbsp; I'm sure that all else equal, if national media never figured out how much fucking money there is to be made in keeping people salty, the news would still be a puff of dry air.&lt;br /&gt;&lt;br /&gt;So I don't watch network TV. I don't blog about politics. It's a calm life. I have informed opinions on most issues, but I know that nobody cares what I think, so I keep to myself.&amp;nbsp; Maybe that's why I still have trouble &quot;getting&quot; Twitter.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Stop Using the Word 'We'</title>
    <link href="http://teddziuba.com/2009/08/stop-using-the-word-we.html"/>
    <updated>2009-08-20T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/08/stop-using-the-word-we</id>
    <content type="html">
      &lt;img alt=&quot;theres-something-about-lane-splitting-through-marijuana-smoke-on-a-motorcycle-thats-so-unsettling.jpg&quot; src=&quot;/images/theres-something-about-lane-splitting-through-marijuana-smoke-on-a-motorcycle-thats-so-unsettling.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt; &lt;div&gt;Yesterday, I spearheaded a new movement at the office. I stopped using the word &quot;we&quot;, and started to say what I really meant to say.&amp;nbsp; For example, instead of &quot;&lt;i&gt;We&lt;/i&gt; should fix that bug&quot;, I say, &quot;&lt;i&gt;You&lt;/i&gt; should fix that bug&quot;, and good God is it satisfying.&lt;br /&gt;&lt;br /&gt;There are a couple of motivations for this. Firstly, one of the key things I've learned being a for-pay writer is to show some conviction. Secondly, the passive discussions about defects and delegation and responsibility really started to irritate me. Why not just tell it like it is?&lt;br /&gt;&lt;br /&gt;When I worked at Google, I picked up on a really annoying trend in the software industry (or maybe just in Silicon Valley) that I call &quot;fuck-you with a smile&quot;.&amp;nbsp; You never want to outright blame somebody or something, rather, it's best to state the existence of an issue and then ask &quot;the team&quot; to fix it.&amp;nbsp; We should really move that icon ten pixels to the left. We definitely need to fix that concurrency bug. We should probably have that all done before lunch.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Well then, Mr. 
Manager, you had better get cracking, because I've got some YouTube videos to watch.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I learned that the goal of institutional business is to keep from angrying up the blood at all costs.&amp;nbsp; A productive employee is one whose personality has been bleached out to a yellow tinge.&amp;nbsp; Always non-confrontational, never suggesting that any one person fucked up.&lt;br /&gt;&lt;br /&gt;The best part about working at a startup is that I'm free to suggest that yes, you fucked this up. Yes, it's your fault, and yes, you need to fix it. Delegate! Don't waste time listing out action items, spend time telling people what to do. Everyone you work with should be a grown up, and can handle it. The other side of that is owning up to your mistakes. Instead of &quot;There is a memory leak in the code, we should prioritize it over other defects&quot;, say, &quot;I introduced a memory leak in the code. I am going to fix it as soon as possible.&quot;&lt;br /&gt;&lt;br /&gt;Anyway, I'm going to keep this up until somebody openly calls me an asshole. You should try it too.&amp;nbsp; You don't have to be a prick about it, just be assertive. Your co-workers will be impressed at your newfound confidence. It might even get you laid.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Well, probably not, but you won't be wondering when a meeting is going to end if you grab it by the balls.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Context Switches are Bad, but Stack Traces are Worse</title>
    <link href="http://teddziuba.com/2009/08/context-switches-are-bad-but-s.html"/>
    <updated>2009-08-17T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/08/context-switches-are-bad-but-s</id>
    <content type="html">
      &lt;img alt=&quot;never-trust-a-person-who-wears-a-tie-who-asks-you-how-to-query-the-database.png&quot; src=&quot;/images/never-trust-a-person-who-wears-a-tie-who-asks-you-how-to-query-the-database.png&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;317&quot; width=&quot;290&quot; /&gt; &lt;div&gt;Every programmer works in silent fear of a manager sneaking up on him and asking him to drop everything he is doing and work on an unrelated task. Context switches like this cost us time and energy, but managers are beginning to figure out that a programmer isn't a machine that can be switched on and off, no, they're understanding that a programmer is a machine that needs a warm up phase. &lt;br /&gt;&lt;br /&gt;I guess you could call that progress.&lt;br /&gt;&lt;br /&gt;I'm fortunate enough to work with a company that understands engineering, but I have worked with my fair share of nontechnical managers, and I have to say categorically that the most expensive question a manager can ask is &quot;What are you working on?&quot;&lt;br /&gt;&lt;br /&gt;The danger here is when you're six or seven levels deep into yak-shaving, and your manager wants to know what you're doing and why. You need to give the manager a complete stack trace from your current frame all the way up to the original task. Each jump up the call stack is a context switch of its own, where you need to remember exactly why you made the decision that you did, and justify it as the best course of action. &lt;br /&gt;&lt;br /&gt;&quot;I'm compiling a new version of libxml, so I can get the Python parser working.&amp;nbsp; I need to do that because LXML, the Python binding, would crash under heavy load when I used the system default version of libxml. I am using LXML because BeautifulSoup doesn't have support for XPath. I need to do XPath transforms against the input because the legacy system we interact with doesn't send well-formed XML. 
We tried to get the vendor to fix it but they said 6 to 8 weeks for a patch, and our project deadline is sooner than that. I need to interface with the legacy system because even though the DBAs have ported things over to Oracle, they're still sorting things out and it's not reliable enough for me to make any meaningful progress. Fortunately, I've thought this through and abstracted the data subsystem well enough that I can drop-in replace Oracle when it's ready, so long as the database sends some decent form of XML. Once I get this data subsystem done, I can finish the business logic, which attaches to the nonfunctional demo you love so much.&lt;br /&gt;&lt;br /&gt;So yeah, I'm working on the new asset tracking system.&quot;&lt;br /&gt;&lt;br /&gt;Stack trace, all the way up to the main method.&amp;nbsp; It's not always that simple. I know I have trouble keeping that many stack frames in my head. I can't remember why I chose to go down the path that I did, other than &quot;there was a good reason for it&quot;. After all, I'm not slogging my way through compilation bullshit for my health. When a manager demands a full stack trace like this, it sets your progress back, because you need to go over decisions that you already made, examine the circumstances, and make the same decisions again. You lose your original frame of reference, and your manager thinks you're just fiddledicking around instead of doing work.&lt;br /&gt;&lt;br /&gt;So what's there to do? If you're awesome like me and work for a manager who understands why programming is hard, chances are you can just leave the answer to &quot;what are you doing?&quot; at the innermost stack frame.&amp;nbsp; Everybody wins. However, if your manager is nontechnical, your goal is to get him off your ass as soon as possible, because you want to minimize the damage he does to your productivity. 
My recommended course of action, when asked &quot;What are you working on?&quot; is to slap the manager in the face and yell &quot;&lt;i&gt;YOU DON'T END A SENTENCE WITH A PREPOSITION UP IN THIS BITCH. THIS IS MY HOUSE.&quot;&lt;/i&gt; Thump your chest with a clenched fist and say &lt;i&gt;&quot;yeah what's now, fool&quot;&lt;/i&gt; under your breath.&lt;br /&gt;&lt;br /&gt;Failing the battery charge, the key phrase is &quot;I explored every option&quot;.&amp;nbsp; Beyond this, there's really no way out, because your manager doesn't trust you. You're basically fucked. Quit your job and come work with me.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>This Is America, Take Your Unicode Somewhere Else</title>
    <link href="http://teddziuba.com/2009/07/this-is-america-take-your-unic.html"/>
    <updated>2009-07-04T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/07/this-is-america-take-your-unic</id>
    <content type="html">
      &lt;img alt=&quot;i-only-listen-to-NPR-so-i-can-keep-an-eye-on-what-educated-people-are-up-to-its-merely-an-early-warning-system.jpg&quot; src=&quot;/images/i-only-listen-to-NPR-so-i-can-keep-an-eye-on-what-educated-people-are-up-to-its-merely-an-early-warning-system.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;245&quot; width=&quot;320&quot; /&gt;There's a question that comes up on Stack Overflow every couple of months: &quot;How do I strip diacritic marks from Unicode characters?&quot;.&amp;nbsp; Popular variants include &quot;How do I remove special characters&quot; and &quot;How do I convert Unicode to ASCII&quot;, but the underlying motivation is the same: characters that don't have their own key on an American keyboard have no place in modern web software. &lt;div&gt;&lt;br /&gt;Before you go all apeshit on me and call me a bigot and whatnot, read my story.&amp;nbsp; When I was in college, Google hired me for a summer internship.&amp;nbsp; One of my projects that summer was to write Google's employee directory search.&amp;nbsp; Google, as I'm sure you could imagine, is a very multicultural employer.&amp;nbsp; Googlers in general are very accepting of different cultures, customs, and languages.&amp;nbsp; (Well, sort of.&amp;nbsp; Googlers are accepting of multicultural differences like sushi, Diwali parties, and the word &lt;i&gt;namaste&lt;/i&gt;.&amp;nbsp; They're not accepting of cultural differences like Old English 800, 22 inch rims, and the word &lt;i&gt;juicy&lt;/i&gt;. 
The general rule I figured out as a Googler is that you should welcome diversity so long as it doesn't make you feel guilty for making ten times as much money.)&lt;br /&gt;&lt;br /&gt;Anyhow, as a result of pulling in a lot of foreign talent, my employee directory search had to handle UTF-8 properly.&amp;nbsp; A lot of people's names had umlauts, tildes, and other such little nuggets that love to appear as diamonds with question marks in them. I figured, just make the database UTF-8, page encoding UTF-8, and everything should work fine, right?&amp;nbsp; Well it did, in theory.&amp;nbsp; But when the first super-tolerant Googler typed his colleague's name into my search engine, it didn't come up.&amp;nbsp; There was an o with an umlaut in the name, but our hero of race relations simply typed &quot;o&quot;.&lt;br /&gt;&lt;br /&gt;And that came through to me as a bug report.&amp;nbsp; &quot;Strip funny characters.&quot; So I did, and how the searches flowed.&amp;nbsp; See if you can guess how many people would input diacritic marks into the search box.&lt;br /&gt;&lt;br /&gt;Googlers are some of the most understanding people out there, and if they can't be bothered to type Alt-148 for an o with an umlaut, then what hope does the rest of the software industry have?&amp;nbsp; None.&amp;nbsp; That's why I want to systematically dismantle Unicode, and have a good answer to the question &quot;How do I strip diacritic marks?&quot;.&amp;nbsp; Not because handling multibyte character sets is too hard (although that asspain is what prompted me to think about this in the first place), but rather because only a small minority of people actually care about it, and an even smaller minority will whine when their umlauts disappear.&lt;br /&gt;&lt;br /&gt;(To satisfy the pedants, clearly if you're writing software whose job it is to handle and store UTF-8, this advice isn't for you.&amp;nbsp; I'm talking about web services with user input here.)&lt;br /&gt;&lt;br /&gt;Now, you can 
feel free to take an idealist approach to this problem.&amp;nbsp; Yes, Americans should be more accepting of other cultures and not passively destroy intricate details of pronunciation.&amp;nbsp; Well, feel free to enjoy your floating-point market share.&amp;nbsp; Nobody cares but you.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;End Note&lt;/b&gt;&lt;br /&gt;I found two decent implementations of Unicode transliteration, one in Python and one in Perl. If you know of good implementations in other languages, e-mail me and I'll add them to this list, with SEO-friendly anchor text goodness.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://www.tablix.org/%7Eavian/blog/archives/2009/01/unicode_transliteration_in_python/&quot;&gt;Strip diacritic marks in Python&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://search.cpan.org/%7Esburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm&quot;&gt;Strip diacritic marks in Perl&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.rgagnon.com/javadetails/java-0456.html&quot;&gt;Strip diacritic marks in Java&lt;/a&gt; (thanks to Simon Lieschke)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://dev.alt.textdrive.com/browser/HTTP/Unidecode.lua&quot;&gt;Strip diacritic marks in Lua&lt;/a&gt; (for all 8 of you who use it. Thanks to Petite Abeille)&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://blargh.tommymontgomery.com/2009/08/transliteration-in-php/&quot;&gt;Strip diacritic marks in PHP&lt;/a&gt; (thanks to Tommy Montgomery)&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;
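&lt;p&gt;For the curious, here is a minimal sketch of the simplest version of this trick in Python, using nothing but the standard library's unicodedata module. It is an illustration, not necessarily how any of the implementations linked above work, and the function name strip_diacritics is made up for this example. It decomposes each character and throws away the combining marks, so an o with an umlaut becomes a plain o.&lt;/p&gt;

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks (umlauts, tildes, accents) removed.

    Hypothetical helper for illustration only.
    """
    # NFKD splits a character like "ö" into a base "o" plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep every other character as-is.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Jörg Müller"))  # Jorg Muller
print(strip_diacritics("Señor Ñandú"))  # Senor Nandu
```

&lt;p&gt;This only handles characters that decompose into a base letter plus a mark. Letters with no such decomposition, like the Scandinavian slashed o or the German sharp s, pass through untouched, which is where a full transliteration table earns its keep.&lt;/p&gt;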
    </content>
  </entry>

  <entry>
    <title>Print Isn't Dying, Serious Journalism Is</title>
    <link href="http://teddziuba.com/2009/06/print-isnt-dying-serious-journ.html"/>
    <updated>2009-06-22T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/06/print-isnt-dying-serious-journ</id>
    <content type="html">
      &lt;img alt=&quot;when-techcrunch-pays-writers-six-figures-then-arrington-can-talk-about-success.jpg&quot; src=&quot;/images/when-techcrunch-pays-writers-six-figures-then-arrington-can-talk-about-success.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;278&quot; width=&quot;200&quot; /&gt; &lt;div&gt;It's a tired Silicon Valley drum beat: print is dying, blogs and Twitter are the future of news.&amp;nbsp; Many in the business of blogging like to think that print ad revenues are declining and subscriber bases are shrinking because online media is vastly superior to those dinosaurs.&amp;nbsp; This is one area where the evidence actually seems to suggest that the bloggers are justified.&lt;br /&gt;&lt;br /&gt;However, if you're not so full of yourself that &quot;citizen journalism&quot; seems like a revolution, you can understand the real reason that print is dying: &lt;i&gt;newspapers' shit is all retarded&lt;/i&gt;.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Too many big words, articles that are way too long, and boring stuff like researched facts.&amp;nbsp; Fuck all that shit, I want my news as it happens, and I don't care how true it is.&amp;nbsp; Bloggers call this process journalism...whatever.&amp;nbsp; That's just writers trying to convince themselves that they're serious when they know deep down that their readership is only interested in sensational titles and text no longer than 300 words.&amp;nbsp; Any more than that, well, shit's all retarded.&lt;br /&gt;&lt;br /&gt;The only satisfying part of journalism turning into shinythings.com is watching intellectuals whine about it.&amp;nbsp; See, I probably should be an intellectual.&amp;nbsp; I've got a degree in mathematics, I'm a computer programmer by trade, but every time I've knocked an article out of the park for The Register, it's been a great troll.&amp;nbsp; That's the only way to get by in online media, and even the New York Times knows 
this.&lt;br /&gt;&lt;br /&gt;Take for example, NYT columnist Paul Krugman.&amp;nbsp; He won a Nobel Prize in economics, and has been writing the same op-ed column for NYT for the past 8 years: &quot;Republicans are the cause of all the world's ills.&quot;&amp;nbsp; Someone who's shit is arguably all retarded has been reduced to trolling to get page views.&amp;nbsp; And it really works.&lt;br /&gt;&lt;br /&gt;If, as a blogger, you're above trolling, then the only other way to be popular is by printing blatant falsehoods.&amp;nbsp; In 2008, people actually started to pay attention to CNN's iReport because somebody wrote that Steve Jobs had a heart attack. Apple lost 10% of its market capitalization in 10 minutes.&amp;nbsp; Now &lt;i&gt;that's&lt;/i&gt; fucking power.&amp;nbsp; TechCrunch's Michael Arrington, showing an obvious tell of a manic depressive, keeps going off on Last.FM with lies about them giving data away to the recording industry.&amp;nbsp; None of it is true, but it brings readers.&lt;br /&gt;&lt;br /&gt;It certainly doesn't hurt that TechCrunch shies away from words longer than eight letters.&lt;br /&gt;&lt;br /&gt;Print media isn't hurting because it's an outdated business model, print media is hurting because it's boring.&amp;nbsp; Blogs and Twitter are succeeding because their shit is clearly not retarded.&amp;nbsp; And you know what?&amp;nbsp; I love it.&amp;nbsp; Intellectualism is dying, and the news is now anything we want it to be. &lt;br /&gt;&lt;br /&gt;I just can't wait until 4chan figures that out.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Startups: Keep It In Your Pants</title>
    <link href="http://teddziuba.com/2009/06/startups-keep-it-in-your-pants.html"/>
    <updated>2009-06-09T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/06/startups-keep-it-in-your-pants</id>
    <content type="html">
      &lt;img alt=&quot;if-you-read-mike-arringtons-posts-closely-you-see-that-he-has-major-depression-issues.jpg&quot; src=&quot;/images/if-you-read-mike-arringtons-posts-closely-you-see-that-he-has-major-depression-issues.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;300&quot; width=&quot;202&quot; /&gt; &lt;div&gt;I've worked with a lot of engineers around the Valley, some who are genuinely competent and some who can fake it pretty well.&amp;nbsp; One trend that I've noticed with a lot of really good engineers is that they like to swing their dicks around when it comes to implementation.&lt;br /&gt;&lt;br /&gt;You start a project with one of these guys, and the first thing to come up is how MySQL isn't going to scale, and how you're going to have to write your own data store.&amp;nbsp; With that settled, you'll also need your own object-relational mapper, and you might as well make your own web templating language because well, it will fit in better with the architecture.&lt;br /&gt;&lt;br /&gt;This, gentlemen, is dick-swinging, and it is the most colossal waste of time for a startup.&lt;br /&gt;&lt;br /&gt;Now, it's a well known fact around Northern California that I'm the greatest programmer who ever lived, and I even fell victim to this.&amp;nbsp; At my last startup, we were absolutely convinced that we were building ourselves into a corner by using MySQL, so we wrote our own data store.&amp;nbsp; It started off as an RPC wrapper around some magical key/value store in Erlang (parallelism, fuck yeah), and ended up as a different RPC wrapper around BerkeleyDB.&amp;nbsp; All in all, it went through three major rewrites, and the end product was something that took months to develop and would crash under moderate load.&lt;br /&gt;&lt;br /&gt;But hey, it was a cool architecture.&lt;br /&gt;&lt;br /&gt;As another small example, again at the last startup I spent a few hours one day writing a 
feedforward neural network implementation in Java, just to try my hand at implementing an algorithm.&amp;nbsp; Again, a small waste of time, but it was my attitude toward it that signaled a larger problem: I wanted to see how awesome I really was (answer: pretty fuckin' awesome).&lt;br /&gt;&lt;br /&gt;It's not just apartment-bound startups that fall victim to this, either.&amp;nbsp; Kosmix, which is a well funded science project that's fooled itself into thinking it can be a major player in search, wrote its own data store in C++.&amp;nbsp; It's basically a clone of Google's GFS because hey, if Google's doing it, then we should too, right?&amp;nbsp; Who knows how much time, energy, and money was wasted on this thing, but that's all time, money, and energy that could go into making their final product not such a joke.&lt;br /&gt;&lt;br /&gt;Kosmix falls to a different sword: they are well funded and assume they have all the time in the world.&amp;nbsp; Maybe a serious venture round buys you time, but when you spend it all writing a file system that's not core to your product, you start talking Series C, Series D, and so on.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Fortunately, trench-level engineers aren't concerned with dilution.&amp;nbsp; Oops.&lt;br /&gt;&lt;br /&gt;At my current startup, we've got business-focused leadership.&amp;nbsp; We have a good engineering team, and we don't let our hubris get the best of us.&amp;nbsp; There are so few instances where a startup will need to write something like a file system, and we're not one of them.&lt;br /&gt;&lt;br /&gt;As an entrepreneur, you should be prideful of your idea, not how big you think your compiler-cock is.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>My Twitters: Let Me Show You Them</title>
    <link href="http://teddziuba.com/2009/06/my-twitters-let-me-show-you-th.html"/>
    <updated>2009-06-02T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/06/my-twitters-let-me-show-you-th</id>
    <content type="html">
      &lt;img alt=&quot;federal-assault-shark-ban.jpg&quot; src=&quot;/images/federal-assault-shark-ban.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;262&quot; width=&quot;350&quot; /&gt;I signed up for Twitter.&amp;nbsp; Do you people have any idea how fucking important I am?&amp;nbsp; It's a good thing I'm benevolent enough to clue you people into the glory of my day to day operations.&lt;br /&gt;&lt;br /&gt;You should consider it a fucking honor to read my Twitters.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://twitter.com/dozba&quot;&gt;http://twitter.com/dozba&lt;/a&gt;&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Hacking Domains by Proxy</title>
    <link href="http://teddziuba.com/2009/06/hacking-domains-by-proxy.html"/>
    <updated>2009-06-02T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/06/hacking-domains-by-proxy</id>
    <content type="html">
      &lt;img alt=&quot;passive-aggressive-and-gullible-is-no-way-to-go-through-life-son.jpg&quot; src=&quot;/images/passive-aggressive-and-gullible-is-no-way-to-go-through-life-son.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;200&quot; width=&quot;200&quot; /&gt;Remember how Uncov.com lapsed registration, and somebody bought it with Domains by Proxy?&amp;nbsp; I'm sure other people have faced this problem: how do you find out who owns a proxy domain?&amp;nbsp; Well, I successfully hacked the system.&lt;br /&gt;&lt;br /&gt;Here's how it works.&amp;nbsp; When someone registers a domain with Domains by Proxy, the e-mail addresses listed in the WHOIS record for the administrative and technical contacts proxy through to the person who actually registered it.&amp;nbsp; If that person directly replies to an e-mail, you can see who actually owns the domain.&lt;br /&gt;&lt;br /&gt;As usual with anything technical, the weakest link is the human.&amp;nbsp; The KGB used to say &quot;it's easier to break fingers than it is to break codes&quot;.&amp;nbsp; And it's easier to exploit greed than it is to subpoena Domains by Proxy or hack their computers.&lt;br /&gt;&lt;br /&gt;Check this shit out:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;who-the-fuck-you-think-you-fuckin-with-im-the-fuckin-boss.png&quot; src=&quot;/images/who-the-fuck-you-think-you-fuckin-with-im-the-fuckin-boss.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;302&quot; width=&quot;672&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Names hidden to protect the douchey, but if you've got ten thousand extra dollars hanging around, you can have uncov.com all for yourself.&lt;br /&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Don't be a Menace to South Central</title>
    <link href="http://teddziuba.com/2009/05/dont-be-a-menace-to-south-cent.html"/>
    <updated>2009-05-25T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/05/dont-be-a-menace-to-south-cent</id>
    <content type="html">
      &lt;img alt=&quot;mapreduce-reduces-the-map-of-the-web.jpg&quot; src=&quot;/images/mapreduce-reduces-the-map-of-the-web.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;330&quot; width=&quot;240&quot; /&gt;The internet has this nasty habit of violently breaking down market inefficiencies.&amp;nbsp; For example, the music market was inefficient because driving to a store and exchanging money for a record was too much effort.&amp;nbsp; Downloading an album for free is a virtually frictionless process, so Napster, KaZaA and others thrived.&lt;br /&gt;&lt;br /&gt;Now, I think the word &quot;disruption&quot; is mostly used by half-wits in the media to pimp something.&amp;nbsp; When you read that something is &lt;i&gt;disruptive&lt;/i&gt;, that's an obvious tell that the journalist doesn't have a fucking clue about the technology, but is pretending that he does.&amp;nbsp; Journalists love that shit, and so do editors.&amp;nbsp; However, as an entrepreneur, you haven't created a disruption until a group of powerful old men convene in a board room to figure out how to shut you down.&amp;nbsp; You haven't created a disruption until the government is trying to regulate you.&amp;nbsp; You haven't created a disruption until there's a media campaign &lt;i&gt;against&lt;/i&gt; you.&lt;br /&gt;&lt;br /&gt;That being said, I love the idea of paid posting and sponsored conversations: companies paying bloggers to talk about their shit.&amp;nbsp; Why? 
Because it's really pissing off people who make a living out of public relations.&lt;br /&gt;&lt;br /&gt;When I was writing Uncov, I would get several e-mails daily from PR agencies, pitching me a story on such and such a shitty startup.&amp;nbsp; This is how it works: your company pays a PR agency for the size of their Rolodex.&amp;nbsp; The PR agency spams the publications with your press release in hopes that the story gets picked up.&amp;nbsp; In the tech media, the hit rate for PR isn't terribly high, so you end up spending upwards of $10,000 per month on PR that only gets your company a few writeups.&amp;nbsp; It's a scam.&lt;br /&gt;&lt;br /&gt;A company like PayPerPost, now Izea, has removed that market inefficiency.&amp;nbsp; You can simply pay bloggers directly to write about you.&amp;nbsp;&amp;nbsp; Whether or not they disclose that they're being paid, well, who gives a shit? You get the Google link juice, you get the attention, and if it comes out that you paid for it, the internet has the attention span of a fruit fly, so everyone will forget about it in 24 hours.&lt;br /&gt;&lt;br /&gt;Any blogger who takes a stand against paid posting is delusionally self-important.&amp;nbsp; There is no morality to &quot;citizen journalism&quot; by definition.&amp;nbsp; The idea is that the traditional media will die in favor of hundreds of thousands of individual reporters, all working for themselves.&amp;nbsp; The good ones will rise to the top, but everybody will keep talking.&amp;nbsp; In this type of scheme, there is no force whatsoever that can stop paid placement.&amp;nbsp; With a few large media outlets, like The New York Times, The Washington Post, and other newspapers, paid placement isn't an issue because so much credibility is at stake.&lt;br /&gt;&lt;br /&gt;Not so on the internet.&amp;nbsp; Complain about it all you want, but paid placement is a necessary side effect of user-generated media.&amp;nbsp; Regulations against it are like speed limits: there's 
bound to be some marginal enforcement, but by and large, they do nothing to prevent it.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;It's what we wanted, now it's what we've got.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Startup Dad</title>
    <link href="http://teddziuba.com/2009/05/-the-first-time-you.html"/>
    <updated>2009-05-21T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/05/-the-first-time-you</id>
    <content type="html">

      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	&lt;/p&gt;&lt;img alt=&quot;babbyform.JPG&quot; src=&quot;/images/babbyform.JPG&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;274&quot; width=&quot;365&quot; /&gt;&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;The first time you hold somebody
      else's screaming baby, you understand immediately why prostitution is
      the world's oldest profession.  The first time you hold your own
      screaming baby, you understand immediately why a federal prisoner who
      gunned down three police officers needs no moral justification for
      sticking the sharpened end of a toothbrush into a freshly jailed
      child molester's kidney. &lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;margin-bottom: 0in;&quot;&gt; It's that first revelation that scares many
      first-time fathers into escaping responsibility like a jackrabbit
      from a coyote.  The difference between the ones who run and the ones
      who stay, really, is how the news was broken to them:  a great
      comedian will tell you that the key to any good joke is the
      delivery.
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	A runner was out on a Tuesday night
      with some of his friends at a bar, somewhere between his third drink
      and fourth cigarette, when the girl he'd been fucking calls him up
      and says that she's pregnant.  A dad who sticks around is
      concentrating on some manly order of housework, like changing the oil
      in a car, when his wife or girlfriend calls him in to make sure that
      the little blue plus sign actually does mean &amp;#8220;pregnant&amp;#8221;, and that
      she's not just misreading it.&lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	With a baby in the house, the bedroom
      is going to lose its understated but victorious smell of Astroglide
      and unwashed sheets in favor of a strong presentation of rancid
      breast milk.  When there's a child to take care of, getting falling-down
      drunk to the point where you're willing to argue with a street
      vendor over the price of a 2AM hot dog isn't really an option in the
      list of things to do this weekend.  With a baby, all of the money you
      used to spend on video games and car accessories is going to be
      repurposed for child care.
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	When a man runs from fatherhood, he's
      not really running from responsibility, he's running from the guilt
      of a mediocre life.
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	Without the responsibility of a baby,
      there's still time to salvage it.  A month after disappearing,
      though, a runner realizes the vicious truth: that no amount of time
      or things-not-to-be-responsible-for will turn an unaccomplished life
      into one your eventual children will look up to.  Fleeing your responsibility
      and making that new year's resolution to get your life on track is as
      effective as telling yourself that you have the courage to ask out a
      girl as you masturbate.  No number of promises will ever amount to
      motivation.&lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	For those of us who stay in the
      picture, ambition has a new meaning. I'm twenty-five years old, a
      software engineer on the startup circuit in Silicon Valley.  I'm not
      in the business so that I get invited to speak at conferences.  I'm
      not an entrepreneur because I want to feel important.  I'm in this
      game now to provide for my family.  At first, I thought that a
      startup was the only part of my youth left breathing, but now, I know
      that having a picture of my daughter stuck to my monitor is the best
      motivation there is.  If you're the type to man up to what's demanded
      of you, a baby won't throw your entrepreneurship game off.
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
      &lt;/p&gt;
      &lt;p style=&quot;margin-bottom: 0in;&quot;&gt;Just make sure you're funded.&lt;/p&gt;

    </content>
  </entry>

  <entry>
    <title>Disable The Annoying Thing In Ubuntu Jaunty</title>
    <link href="http://teddziuba.com/2009/04/disable-the-annoying-thing-in.html"/>
    <updated>2009-04-25T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/04/disable-the-annoying-thing-in</id>
    <content type="html">
      &lt;img alt=&quot;at-least-they-didnt-fuck-up-my-nvidia-drivers-this-time.jpg&quot; src=&quot;/images/at-least-they-didnt-fuck-up-my-nvidia-drivers-this-time.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;230&quot; width=&quot;160&quot; /&gt;I upgraded to Ubuntu Jaunty Jackalope today.&amp;nbsp; It was a positive experience until I opened Pidgin and Evolution and started using my computer.&amp;nbsp; Jaunty has this new feature called NotifyOSD that application makers can use to bug the shit out of you at every possible moment.&amp;nbsp; Someone signed online? Bug the shit out of the user.&amp;nbsp; Received an e-mail? Bug the shit out of the user.&amp;nbsp; Joined a wireless network?&amp;nbsp; You guessed it.&amp;nbsp; Let's bug the shit out of the user.&lt;br /&gt;&lt;br /&gt;The old notifier used to stay out of your way.&amp;nbsp; You'd get a little message or whatnot when you got a new e-mail.&amp;nbsp; It was unobtrusive and didn't distract you while you were trying to figure out why some little bit of SQLAlchemy code was making too many calls to a database.&amp;nbsp; But now, Canonical has found it necessary to make sure you're abundantly aware of every excruciating detail of your computer's operation.&amp;nbsp; Productivity be damned.&lt;br /&gt;&lt;br /&gt;I don't know whose bright idea this feature was, but whoever it was is trying to spread their terminal case of attention deficit disorder to the rest of the world.&amp;nbsp; Fuck you.&amp;nbsp; Grind up your Adderall pills and snort them until your heart shits out like a Chevy.&lt;br /&gt;&lt;br /&gt;Anyway, if you don't know what I'm talking about, this is the offending window:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;i-hope-youre-proud-of-yourself.png&quot; src=&quot;/images/i-hope-youre-proud-of-yourself.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; 
height=&quot;476&quot; width=&quot;733&quot; /&gt;&lt;b&gt;How To Turn This Thing Off&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Open up a command line.&amp;nbsp; Type this:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;sudo mv /usr/share/dbus-1/services/org.freedesktop.Notifications.service /usr/share/dbus-1/services/org.freedesktop.Notifications.service.disabled&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;And restart your computer.&amp;nbsp; You could probably restart the dbus daemon, but that makes a lot of things go ill on your machine.&lt;br /&gt;&lt;br /&gt;This disables the notifier for good.&amp;nbsp; Now you can get back to work.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>DiggBar is a Howl of Desperation</title>
    <link href="http://teddziuba.com/2009/04/diggbar-is-a-howl-of-desperati.html"/>
    <updated>2009-04-10T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2009/04/diggbar-is-a-howl-of-desperati</id>
    <content type="html">
      &lt;img alt=&quot;silicon-valley-is-decadent-and-depraved.jpg&quot; src=&quot;/images/silicon-valley-is-decadent-and-depraved.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;270&quot; width=&quot;280&quot; /&gt; &lt;div&gt;Since the most recent death of Uncov, I've tried to lay off the Web 2.0 shit.&amp;nbsp; However, as a connoisseur of fail, I thought the DiggBar was worth an examination.&lt;br /&gt;&lt;br /&gt;DiggBar is a URL shortening service from Digg, the internet's largest community of whiners, armchair political activists, inconsolable Book-of-Steve-Jobs bible beaters, and automatic voting bots.&amp;nbsp; The long and short of it is this: you can put any address into it, and it will give you a way to view that URL through Digg.com.&amp;nbsp; For example, &lt;a href=&quot;http://digg.com/u1hrO&quot;&gt;http://digg.com/u1hrO&lt;/a&gt; brings you back here, except with a Digg toolbar at the top.&lt;br /&gt;&lt;br /&gt;There's been a small wave of butthurt over this little scheme, because every link on the front page of Digg.com now leads you to one of these toolbars instead of to the actual content.&amp;nbsp; So, when a Digg user clicks through, he never actually leaves Digg.com.&amp;nbsp; They've done some of the stuff necessary so that publishers don't get shafted on the traffic or the PageRank (and still managed to fuck that up), but still, that little bar adds little to no value to the user.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Then why did they do it?&lt;br /&gt;&lt;br /&gt;When an entrepreneur raises any useful amount of money from an investor, he needs to answer to that investor.&amp;nbsp; The CEO of a company reports to the board of directors, which usually includes the investors.&amp;nbsp; Every quarter, the CEO must lay out goals and objectives, and for internet companies, these goals and objectives always, &lt;b&gt;&lt;i&gt;always&lt;/i&gt;&lt;/b&gt; include traffic growth.&lt;br 
/&gt;&lt;br /&gt;Digg has raised 40 million dollars to date.&amp;nbsp; With that kind of money, investors demand explosive growth.&amp;nbsp; Since the economy has gone to shit, there's a very slim chance that Digg will see a sale before it needs to raise more money.&amp;nbsp; As a small company, Digg could have been a very profitable business, but instead they took too much money and set too many expectations for themselves.&amp;nbsp; I can guarantee you that Jay Adelson (CEO) and Kevin Rose have some demanding goals to meet, and lately, they haven't been meeting them.&lt;br /&gt;&lt;br /&gt;Hence the introduction of this DiggBar business.&amp;nbsp; When a link makes its way to the top of Digg, it gets republished quite a bit.&amp;nbsp; Now that all these links will land a user at Digg.com, Digg collects the unique users from this collateral linkage.&amp;nbsp; And it's working, too.&amp;nbsp; In a recent interview, VP John Quinn of Digg said that the DiggBar has given them a 20% boost in unique visitors.&lt;br /&gt;&lt;br /&gt;This move shows that not only is Digg willing to pull some sleazy shit to increase their unique visitors, but that they also &lt;i&gt;need&lt;/i&gt; to pull this sleazy shit, because they need more unique visitors.&lt;br /&gt;&lt;br /&gt;Damn.&amp;nbsp; And I would have gotten away with it too, if it weren't for you meddling kids.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Footnotes.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Uncov.com died again.&amp;nbsp; I never owned the domain name, one of my business partners did.&amp;nbsp; The registration on it recently lapsed, and somebody picked it up.&amp;nbsp; It now redirects to a Twitter search for &quot;kevinrose&quot;.&amp;nbsp; If you're the one who bought it, good show.&amp;nbsp; Thanks for not being a spammer.&amp;nbsp; I'm not willing to buy it back from you, but if you want to give it to me, I will take you out for a beer to congratulate your achievement.&lt;/li&gt;&lt;li&gt;No, I 
did not get fired from The Register.&amp;nbsp; My wife and I had a daughter last week, and I am taking some time off.&amp;nbsp; Not sure when I'll return yet, but I will.&amp;nbsp; I just need to get my sleeping back on schedule.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Be a Better Blogger. Stop Reading Blogs.</title>
    <link href="http://teddziuba.com/2009/03/be-a-better-blogger-stop-readi.html"/>
    <updated>2009-03-08T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/03/be-a-better-blogger-stop-readi</id>
    <content type="html">
      &lt;img alt=&quot;three-days-hike-to-the-douchebag-dharma-station.jpg&quot; src=&quot;/images/douchebag-dharma-station.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;478&quot; width=&quot;206&quot; /&gt; &lt;div&gt;The greatest hope of the internet generation is that you can share your thoughts with everybody in the world.&amp;nbsp; The greatest letdown of the same generation is that nobody cares.&amp;nbsp; Still, that doesn't keep us from trying.&lt;br /&gt;&lt;br /&gt;Bloggers are good people, generally.&amp;nbsp; We're self-confident in a passive-aggressive sort of way, we're opinionated, and best of all, we can type quickly.&amp;nbsp; But what separates bloggers from each other?&amp;nbsp; Those who can break exclusive news usually have a good following, but what about the rest of us?&amp;nbsp; How do you actually get better at blogging?&lt;br /&gt;&lt;br /&gt;Not that you asked for it, but this is my advice: read more, write less.&amp;nbsp; By &quot;read more&quot;, I mean books.&amp;nbsp; Newspapers are useless, because the job of every newspaper editor is to remove any semblance of personality from all of the text.&amp;nbsp; It's just facts, and facts are fuckin' &lt;i&gt;boring&lt;/i&gt;.&amp;nbsp; If you want to cruise programming.reddit a few times a day and write reactionary articles, fine, live with that crowd.&amp;nbsp; It's not an interesting place to be, though.&amp;nbsp; Telling the world why you think DHH is wrong about some programming methodology isn't going to get you a column at Rolling Stone.&lt;br /&gt;&lt;br /&gt;Getting on the front page of Digg is not an accomplishment.&lt;br /&gt;&lt;br /&gt;Blogarrhea begets blogarrhea.&amp;nbsp; There's a continuous global discussion on the internet, and if you're not the one who started it, you're just background noise.&amp;nbsp; People like Paul Graham and Dave Winer never really say anything original, they just enjoy the act of 
typing.&amp;nbsp; Graham has been re-writing the same three essays for almost a decade, and Winer, well, Winer doesn't have much to do during the day, and at least blogging keeps him away from drugs and rap music. I guess it's a positive influence.&lt;br /&gt;&lt;br /&gt;If you want to keep that company, do so, but like programming, writing is so much better when you value elegance as well as functionality.&lt;br /&gt;&lt;br /&gt;Which brings me to my second point.&amp;nbsp; Write less.&lt;br /&gt;&lt;br /&gt;For the last two months, I have been working my way through a pile of books: everything ever published by &lt;a href=&quot;http://en.wikipedia.org/wiki/Chuck_palahniuk&quot;&gt;Chuck Palahniuk&lt;/a&gt; (tl;dr: the guy who wrote Fight Club).&amp;nbsp; I'm almost done, a book and a half to go.&amp;nbsp; Chuck likes to do these writers' workshops, and somebody once asked him what he does when he's stuck.&amp;nbsp; He knows where the story needs to go, but just doesn't know how to get it there.&lt;br /&gt;&lt;br /&gt;Chuck's response: &quot;Did you ever go into the bathroom and try and take a shit when you didn't have to go?&quot;&lt;br /&gt;&lt;br /&gt;Whenever I sit down to write a post here, it's because I really have to take a dump.&amp;nbsp; Incidentally, sometimes when I write for The Register, it feels like I'm really trying to squeeze one out.&amp;nbsp; If I end up dead from an aneurysm, that's what happened.  Setting a post-per-week quota for yourself is like setting a lines-of-code quota at work.&lt;br /&gt;&lt;br /&gt;Don't write just because you want to spend some time on the pot.&amp;nbsp; Do it because you really have to go.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Effective Vices for the IT Professional</title>
    <link href="http://teddziuba.com/2009/02/effective-vices-for-the-it-pro.html"/>
    <updated>2009-02-08T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/02/effective-vices-for-the-it-pro</id>
    <content type="html">
      &lt;img alt=&quot;practicing-depravity-makes-you-better-at-it.jpg&quot; src=&quot;/images/practicing-depravity-makes-you-better-at-it.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt; &lt;div&gt;There's a blog post that snakes through the programming community every three months: the one about only hiring programmers who program in their spare time.&amp;nbsp; It's always the same person who writes it, too.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;He's a sequentially numbered employee at a company with a well-tracked ticker symbol, and his only outlet of authority is sweating down some poor sod in a windowless interview room, asking questions about sorting integers in linear time.&lt;br /&gt;&lt;br /&gt;The truth of it is, after a day of writing JUnit tests to achieve the corporate-policy-mandated code coverage metric, you don't need to go home to a Haskell compiler.&amp;nbsp; You need to go home to a tall drink and a depraved presentation of human sexuality.&amp;nbsp; Corporate coding sucks, and if there's no vice to counteract it, you'll be dead of an aneurysm by age forty.&amp;nbsp; They'll find you on the toilet, pants down, your copy of Design Patterns unceremoniously splayed open on the floor.&lt;br /&gt;&lt;br /&gt;Programming isn't a glamorous job, and pretending that it is won't make you any better at it.&lt;br /&gt;&lt;br /&gt;I've been studying some techniques for decompressing the tension built up by JBoss and WebSphere in my personal lab for quite some time now.&amp;nbsp; I'm not a corporate coder anymore, but when I was, I studied ways to make it easier on the head.&amp;nbsp; I'm now ready to share my results with the scientific community.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Drinking&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Alcohol is the most obvious medication because it's cheap and readily available.&amp;nbsp; Parked on the couch, racing your way to the bottom 
of a highball glass of Chivas Regal is a fantastic way to forget that the hour you spent in a meeting watching two type-A personalities fiercely debate Scrum versus XP is one hour less of the life you wanted.&amp;nbsp; The downside is that one drink usually leads to three or four, and you waste the drunkenness on an early sleep because you need to get up early the next day and do it all over again.&lt;br /&gt;&lt;br /&gt;Alcohol interrupts your sleep, and if you're going to stay sharp at work, you need a good rest.&amp;nbsp; If you're one of the damned souls like me, you get vicious hangovers, to the point where swimming in drink for a night isn't even worth it, if you're going to spend the next day wishing that you'd died of alcohol poisoning.&lt;br /&gt;&lt;br /&gt;That being said, at some time in your programming career you need to go to work with a severe hangover, if only to stick it to the man by way of martyrdom.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Drugs&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Here's where the process gets dicey.&amp;nbsp; When you wear a button-down shirt to work, you're not the usual type of person who ends up in county lock-up on a possession charge.&amp;nbsp; Marijuana is certainly a better option than alcohol in every way, shape, and form, but like it or not, it is illegal.&amp;nbsp; There are some exceptions here in California, but you're still rolling the dice.&amp;nbsp; Getting pinched could mean getting fired.&lt;br /&gt;&lt;br /&gt;The scary shit comes as rocks or powders.&amp;nbsp; Again, being the khakis-and-necktie crowd, nobody really expects you to be shooting black-horse heroin in the shower.&lt;br /&gt;&lt;br /&gt;There is a convenient edge case when it comes to drugs, though.&amp;nbsp; Prescription painkillers, when used appropriately, really take the edge off of reality.&amp;nbsp; Again, you run the risk of upsetting John Q. 
Law, so make sure it's legit.&amp;nbsp; While they make for good entertainment in the evening, there's a real possibility that you can get addicted, and once a vice starts interfering with your work, then you're fucked.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Strippers&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you're the type that easily takes to strippers, you had better come ready to peel off the cabbage: this vice doesn't come cheap.&amp;nbsp; There's also a simple but unrelenting set of rules you need to learn to keep from getting your ass kicked by a bouncer.&amp;nbsp; It's the kind of thing you'll pick up as you go.&lt;br /&gt;&lt;br /&gt;For the programmer or IT professional, strippers are an excellent choice.&amp;nbsp; You usually show up to the gentleman's establishment with a bit more money than any of the other clients, so you'll be Mr. Popular.&amp;nbsp; Just be respectful of what's going on: it's not so much a smut show as it is a first hand demonstration in a loosely regulated free market.&amp;nbsp; The dancers are there to make a buck, and don't you forget it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Tobacco&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Tobacco is a great vice for the programmer because it's a performance enhancing drug as well as an escape.&amp;nbsp; I recommend either cigars or smokeless tobacco to avoid the growing anti-cigarette movement.&amp;nbsp; A cigar is gangster, and chewing tobacco is stealth.&lt;br /&gt;&lt;br /&gt;After putting down a nice stogie wrapped in Connecticut shade, you'll feel like it's time for action.&amp;nbsp; Nicotine is a fantastic stimulant - better than caffeine.&amp;nbsp; If you're the work-from-home type, smoking a cigar twenty minutes before you start will send you on your way in a hurry.&amp;nbsp; Avoid the dregs, though: don't buy a cigar in any place that sells gasoline.&lt;br /&gt;&lt;br /&gt;Chewing tobacco is often overlooked.&amp;nbsp; Yeah, you say it's more of a staple with the Nascar crowd, but that's really just a stereotype 
invented by the Nascar crowd, designed to keep you damned hoity-toity folks from driving up the cost of a can of chaw.&lt;br /&gt;&lt;br /&gt;The key part about dip is that you can do it at your desk.&amp;nbsp; Spit into an empty Coke bottle.&amp;nbsp; Nobody will come by to bother you.&amp;nbsp; Plus, think of how authoritative you're going to be at a meeting when you start it off by lipping a fat digger out of a tin of Skoal.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Just Keep It Within Reason&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You can judge any vice on two dimensions: how good is it, and how likely is it to interfere with your work.&amp;nbsp; Once a vice becomes more than a vice, you're going to &lt;i&gt;wish&lt;/i&gt; you were that guy who goes home to code Haskell.&lt;br /&gt;&lt;br /&gt;However, there is a convenient side-effect to the addictiveness.&amp;nbsp; If you are aware enough to see your vice getting out of hand, it's probably time to quit your job.&lt;br /&gt;&lt;br /&gt;Just don't do anything illegal.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Corporate Blogs: It's The PageRank, Stupid</title>
    <link href="http://teddziuba.com/2009/01/corporate-blogs-its-the-pagera.html"/>
    <updated>2009-01-19T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/01/corporate-blogs-its-the-pagera</id>
    <content type="html">
      &lt;img alt=&quot;still-not-giving-mint-my-banking-information.jpg&quot; src=&quot;/images/still-not-giving-mint-my-banking-information.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;256&quot; width=&quot;273&quot; /&gt; &lt;div&gt;If you're running an online business and have hired a consultant who tells you that you should have a corporate blog to &quot;better connect with the community&quot;, fire that consultant.&lt;br /&gt;&lt;br /&gt;If you have a corporate blog that is only marginally more interesting than a press release wire, you're wasting your time.&lt;br /&gt;&lt;br /&gt;A corporate blog should serve only one primary purpose: distribution.&amp;nbsp; And I'm not talking about building brand recognition by getting people to read your blog.&amp;nbsp; Nine times out of ten, the text on your corporate blog is a chore to read.&amp;nbsp; Even Google fails this - their pathological cuteness and lame humor come off as contrived.&amp;nbsp; It's not funny.&amp;nbsp; It's irritating.&lt;br /&gt;&lt;br /&gt;Anyway, how does a blog get you distribution if you're not concentrating on branding?&amp;nbsp; PageRank.&amp;nbsp; You can and should use your blog for link-building and search engine optimization.&lt;br /&gt;&lt;br /&gt;A great example of this is &lt;a href=&quot;http://www.mint.com/blog/&quot;&gt;Mint.com's blog&lt;/a&gt;.&amp;nbsp; Mint is a personal finance web product that competes with desktop apps like Quicken.&amp;nbsp; Mint publishes longer articles about personal finance to their blog, and has several thousand readers.&amp;nbsp; That alone is interesting, but not mind-blowing.&amp;nbsp; The trick is that their content is &lt;i&gt;useful&lt;/i&gt;.&amp;nbsp; It's basically a magazine about personal finance without the advertisements.&amp;nbsp; Social media picks up on Mint's content, and it gets a lot of inbound links.&lt;br /&gt;&lt;br /&gt;Mint takes gross advantage of those 
inbound links.&amp;nbsp; That's the whole point.&amp;nbsp; At the bottom of every blog post is this little nugget:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;mint-screenshot.png&quot; src=&quot;/images/mint-screenshot.png&quot; class=&quot;mt-image-center&quot; style=&quot;border: 1px solid black; margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;182&quot; width=&quot;654&quot; /&gt;&lt;br /&gt;A-ha, I see what you're doing there.&amp;nbsp; Mint is juicing their PageRank with the popularity of the blog.&amp;nbsp; If you're a personal finance website, chances are you want to optimize for some of these keywords.&amp;nbsp; And it's really working for them.&lt;br /&gt;&lt;br /&gt;If you use Google's Keyword Tool to estimate the traffic for these keywords, find Mint's rank in the result page for each of them, and then multiply keyword traffic by the distribution of clicks for the top results in Google, you'll see that Mint is raking in at least 100,000 uniques per month from Google for these keywords.&lt;br /&gt;&lt;br /&gt;If you hire a writer to post on your corporate blog, you could be seeing this kind of traffic, too.&amp;nbsp; By &quot;writer&quot;, I don't mean &quot;Peggy in accounts receivable who majored in English thirty years ago&quot;.&amp;nbsp; No, I mean someone whose words are worth reading.&amp;nbsp; A decent freelancer will run you 50 cents per word.&amp;nbsp; A good length blog post is 1,000 words, and you should publish at least once per week.&amp;nbsp; 5 posts like this per month will cost $2,500.&lt;br /&gt;&lt;br /&gt;Now let's compare that to buying traffic from Google by bidding on these keywords.&amp;nbsp; A really, &lt;i&gt;really&lt;/i&gt; conservative estimate of a bid price for keywords like this is 10 cents (but good luck ranking with that bid, cheapskate).&amp;nbsp; To buy 100,000 uniques would therefore cost you $10,000 per month, &lt;i&gt;and&lt;/i&gt; you don't get the PageRank.&lt;br /&gt;&lt;br /&gt;Of 
course, the success of this strategy isn't as quantifiable as buying ads, but eventually you'll see traffic throughput.&amp;nbsp; Any writer worth his salt will be able to game social media sites like Digg and Reddit, which will bring in the backlinks.&amp;nbsp; All you need to do is figure out what keywords to optimize for, and put them in the blog template.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Every day I'm hustlin'&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Advice to Old Men from a Young Man</title>
    <link href="http://teddziuba.com/2009/01/advice-to-old-men-from-a-young.html"/>
    <updated>2009-01-17T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/01/advice-to-old-men-from-a-young</id>
    <content type="html">
      &lt;img alt=&quot;billy-mays-is-still-cooler.jpg&quot; src=&quot;/images/billy-mays-is-still-cooler.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;250&quot; width=&quot;244&quot; /&gt;1. Unless you were shooting Kennedy, nobody cares where you were when Kennedy was shot.&lt;br /&gt;&lt;br /&gt;2. The left lane on the freeway is a young man's game.&lt;br /&gt;&lt;br /&gt;3. Things will always get more expensive.&amp;nbsp; Bitching about the cost of gasoline isn't going to make it any cheaper.&amp;nbsp; Corollary: nobody cares that a gallon of gasoline used to cost a nickel.&lt;br /&gt;&lt;br /&gt;4. War stories: keep them coming.&lt;br /&gt;&lt;br /&gt;5. If you have a prosthetic hook-arm, it's your duty to use it to scare children.&amp;nbsp;&amp;nbsp; Corollary to #4, your prosthetic hook-arm makes a war story way better.&amp;nbsp; If you didn't lose your arm in a war, make up a good war story to explain it.&amp;nbsp; Nobody will know the difference.&lt;br /&gt;&lt;br /&gt;6. The world doesn't owe you anything.&lt;br /&gt;&lt;br /&gt;7. Respect your youngers.&amp;nbsp; We're the ones who will pay your Social Security and take care of you when you're enfeebled.&lt;br /&gt;&lt;br /&gt;8. Advice you offer to young men should fall into one of these three categories:&lt;br /&gt;&amp;nbsp;&amp;nbsp; A. The finer points of tolerable behavior when it comes to strippers&lt;br /&gt;&amp;nbsp;&amp;nbsp; B. Recommendations on quality whiskeys&lt;br /&gt;&amp;nbsp;&amp;nbsp; C. Sticking it to the man&lt;br /&gt;&lt;br /&gt;9. If you're past the point where people depend on you, eat, smoke, drink, and gamble.&amp;nbsp; We young men must control our vices, but you've earned the right to indulge with reckless abandon.&amp;nbsp; Show us what we have to look forward to.&lt;br /&gt;&lt;br /&gt;10. 
You keep getting older, but they stay the same age.&amp;nbsp; From a young man's perspective, a 65 year old man with a 23 year old woman isn't a shame, it's a victory.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Buying Sea Salt?  You Might Be a Sucker.</title>
    <link href="http://teddziuba.com/2009/01/buying-sea-salt-you-might-be-a.html"/>
    <updated>2009-01-11T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2009/01/buying-sea-salt-you-might-be-a</id>
    <content type="html">
      &lt;img alt=&quot;see-also-hypertension.jpg&quot; src=&quot;/images/see-also-hypertension.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt;If there's one thing I have a lot of contempt for, it's neo-hippie bullshit.&amp;nbsp; However, my appreciation for fresh produce just barely overthrows this contempt, so sometimes I go shopping at &lt;a href=&quot;http://www.berkeleybowl.com/&quot;&gt;The Berkeley Bowl&lt;/a&gt; for all kinds of fruits and vegetables that I've never heard of.&amp;nbsp; Really, they have some wonky shit there.&amp;nbsp; Ever see a &lt;a href=&quot;http://en.wikipedia.org/wiki/Buddha%27s_hand&quot;&gt;Buddha's Hand&lt;/a&gt;?&lt;br /&gt;&lt;br /&gt;Anyhoo, they sell sea salt there.&amp;nbsp; Salt, like 'out the ocean.&amp;nbsp; And people buy it.&amp;nbsp; And those people are morons.&lt;br /&gt;&lt;br /&gt;If you buy sea salt, you're paying a premium for the luxury of being a douchebag.&amp;nbsp; It's salt.&amp;nbsp; It has no discernible flavor other than &lt;i&gt;salty&lt;/i&gt;, it has no metric of quality other than &lt;i&gt;not mixed with dirt and glass shards&lt;/i&gt;, and it should have no variation in price other than &lt;i&gt;cheap&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;You can buy 4 pounds of standard issue table salt for $5.37 on the internet.&amp;nbsp; Alternatively, I've seen 4 ounces of sea salt for sale for $2.39.&amp;nbsp; That's $9.56 a pound against $1.34 a pound: about seven times the price, or a markup of roughly 612%.&amp;nbsp; It's a pretty good business if you're selling salt.&lt;br /&gt;&lt;br /&gt;In fact, sea salt might even be bad for you.&amp;nbsp; Regular salt has been used for years as a vehicle for iodine, a chemical your body needs to keep you from becoming a retard.&amp;nbsp; No bullshit, iodine deficiency can cause mental retardation.&amp;nbsp; It only costs a dollar or so to iodize a ton of salt, so it really is ideal.&amp;nbsp; Most sea salt isn't iodized, because it's sold 
as &quot;natural&quot;.&amp;nbsp; Boy, a lesser product for way more money?&amp;nbsp; Where do I sign up?&lt;br /&gt;&lt;br /&gt;Some people claim to be able to distinguish the &quot;superior flavor&quot; of sea salt.&amp;nbsp; These are the same kinds of people who keep a fridge stocked with gallons of bottled water and don't use the tap for anything but watering a house cactus.&amp;nbsp; If you are one of these people, you should kill yourself as a public service.&amp;nbsp; The only real difference between sea salt and table salt you'll feel when you eat it is the coarseness of sea salt.&amp;nbsp; That's it.&amp;nbsp; And coarse salt isn't worth fucking ten dollars a pound.&lt;br /&gt;&lt;br /&gt;tl;dr if you're buying sea salt, consider yourself successfully marketed to.&amp;nbsp; It's like Fiji water.&amp;nbsp; You got hustled.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>There Will Be No Web 3.0</title>
    <link href="http://teddziuba.com/2008/12/there-will-be-no-web-30.html"/>
    <updated>2008-12-21T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/12/there-will-be-no-web-30</id>
    <content type="html">
      &lt;img alt=&quot;husslin-vs-ballin-the-eternal-struggle.jpg&quot; src=&quot;/images/husslin-vs-ballin-the-eternal-struggle.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;284&quot; width=&quot;250&quot; /&gt;The recession reached its hand into Silicon Valley's now lukewarm tub and yanked the plug.&amp;nbsp; It's still draining out, and I wish it would go faster, because there are just too many fucking people in the San Francisco Bay Area.&amp;nbsp; I'm talking about you, guy in your Prius taking the left hand turn on to Middlefield Road too slowly.&amp;nbsp; Leave, now.&amp;nbsp; And don't come back.&amp;nbsp; Bonus points for wrapping your expression of environmental consciousness around a tree.&amp;nbsp; Be one with nature.&lt;br /&gt;&lt;br /&gt;The guy who drives the Prius likely works at a Web 2.0 company that's burning its way through the $4 million it raised from Me2 Ventures, one of the many sheep-funds in the Valley who follow the trends of top-tier investors like Sequoia or DFJ but don't have the connections to pull liquidity out of hype.&lt;br /&gt;&lt;br /&gt;In two years, this guy's company will finally run out of money, having failed to raise another round because investors are too busy conjuring up the next bubble.&amp;nbsp;&amp;nbsp; The failure of Web 2.0 was a live demonstration in I-Told-You-So, as was the first bubble.&amp;nbsp; Both times, the world looked on and thought &quot;what the fuck are you doing?&quot;, and Silicon Valley replied &quot;shut up and bring me my Vaseline&quot;.&amp;nbsp; We went from bad business plans to no business plans, and saw much less liquidity this time.&amp;nbsp; The big bang was YouTube, and it was all down hill from there.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;The Only Easier Money is Marijuana&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;So what will the next bubble be?&amp;nbsp; Green technology.&amp;nbsp; Green energy.&amp;nbsp; Green 
computers.&amp;nbsp; Green pants.&amp;nbsp; Green vomit after an Absinthe adventure.&lt;br /&gt;&lt;br /&gt;Al Gore did a wonderful job creating awareness of global warming.&amp;nbsp; Awareness isn't the right word, but neither is hysteria.&amp;nbsp; Both are close enough.&lt;br /&gt;&lt;br /&gt;San Franciscans were more motivated than usual by this cause, and have begun to care about their carbon footprints or other such nonsense.&amp;nbsp; Making a San Franciscan feel like he alone can make a difference is the best way to control his actions.&amp;nbsp; See also: spending habits.&amp;nbsp; Al Gore, with his nonthreatening voice and relentless assault of data has the power to cultivate the same feeling in stay-at-home-moms and college students.&lt;br /&gt;&lt;br /&gt;Unfortunately, the average American mind can only be concerned with one crisis at a time.&amp;nbsp; Purveyors of fine doom-and-gloom are continuously vying for this spot.&amp;nbsp; Presently, it's the economy.&amp;nbsp; Foreclosures.&amp;nbsp; You're going to lose your house.&amp;nbsp; Oh fuck, you'll lose your house, your family, your car, and did we mention that you'll be living on the street?&amp;nbsp; Fear not.&amp;nbsp; Here's some shit you can buy to make it all better.&amp;nbsp; Here's a politician you can vote for who will fix everything.&lt;br /&gt;&lt;br /&gt;Fear cycles last a few years.&amp;nbsp; Remember when we were afraid of terrorism?&amp;nbsp; What about peak oil?&amp;nbsp; Global &lt;i&gt;cooling&lt;/i&gt; anyone?&amp;nbsp; When money comes back to the Valley, it's going to be aligned perfectly with the beginning of the next fear cycle, and the next fear cycle is going to be global warming.&amp;nbsp; Or climate change.&amp;nbsp; Or polar bear rescue.&amp;nbsp; You can call it whatever you like, as long as you spend money to fix it.&amp;nbsp; Do your part.&amp;nbsp; It's your obligation as a citizen of the earth.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Still Waiting For That 
Twitter Business Plan&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Green tech hasn't taken off yet because liberal guilt can't support a very big market.&amp;nbsp; What you need is government collusion.&amp;nbsp; You need somebody with a gun to step in and say that if you emit more than 100 tons of carbon per year, you need to pay.&amp;nbsp; You need that same person with a gun to say that these carbon emission credits have value, and can be traded.&amp;nbsp; It helps if your typical Silicon Valley entrepreneur or investor believes the call to action.&lt;br /&gt;&lt;br /&gt;That last part is easy.&amp;nbsp; Web 2.0 was all about San Francisco values.&amp;nbsp; Sharing.&amp;nbsp; Caring.&amp;nbsp; Understanding.&amp;nbsp; What would Web 3.0 be about? Many say it's some semantic bullshit.&amp;nbsp; Those are the same people who have figured out what &lt;a href=&quot;http://www.twine.com/&quot;&gt;Twine&lt;/a&gt; does (any hints?).&amp;nbsp; Whatever we can dream up to do over the internet won't draw any money; investors will be bored with web companies after this debacle.&amp;nbsp; The money will go to green tech, because there will be an obvious business plan, popular support, and a government mandate.&amp;nbsp; How can you lose?&amp;nbsp; &lt;br /&gt;&lt;br /&gt;The entrepreneurs will follow suit.&amp;nbsp; Silicon Valley types love to feel like they're making a difference, and green tech will practically let them fellate themselves. 
(In Web 2.0 the Silicon Valley types fellated one another, so this is the natural extension)&amp;nbsp; It will be different people, as an extensive knowledge of Python doesn't give you much insight into solar panel construction, but the same kind of people.&lt;br /&gt;&lt;br /&gt;I believe this because it's satisfying.&amp;nbsp; No more &quot;get users, do something, get bought out&quot;.&amp;nbsp; This time, it's &quot;invent something, build it, sell it&quot;.&amp;nbsp; Sure, we'll be turning a profit by taking sick advantage of alarmism, but it's a business.&amp;nbsp; &lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Shut Your Face, Commons Httpclient</title>
    <link href="http://teddziuba.com/2008/12/shut-your-face-commons-httpcli.html"/>
    <updated>2008-12-18T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/12/shut-your-face-commons-httpcli</id>
    <content type="html">
      If you're like me and every other user on the planet, you don't give a shit when an SSL certificate doesn't validate.&amp;nbsp; Unfortunately, commons-httpclient was written by some pedantic fucknozzles who have never tried to fetch real-world webpages. &lt;br /&gt;&lt;br /&gt;If you want to turn off SSL certificate validation in httpclient, do this:&lt;br /&gt;&lt;br /&gt;1. Put &lt;a href=&quot;http://juliusdavies.ca/commons-ssl/download.html&quot;&gt;not-yet-commons-ssl.jar&lt;/a&gt; on your classpath.&lt;br /&gt;2. Execute the following method before you start any SSL connections:&lt;br /&gt;&lt;br /&gt;

      &lt;pre&gt;&lt;code&gt;
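      // Register a trust-everything socket factory as the handler for https.
      // Protocol.registerProtocol is a static registry, so this is process-wide.
      public static void trustAllCerts() throws GeneralSecurityException, IOException {
          ProtocolSocketFactory sf = new EasySSLProtocolSocketFactory();
          Protocol p = new Protocol(&quot;https&quot;, sf, 443);
          Protocol.registerProtocol(&quot;https&quot;, p);
      }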
      &lt;/code&gt;&lt;/pre&gt;

      This essentially makes commons-httpclient accept every SSL certificate it gets.&amp;nbsp; Yeah, that's what I thought.  Who's bitching now?
    </content>
  </entry>

  <entry>
    <title>Python Makes Me Nervous</title>
    <link href="http://teddziuba.com/2008/12/python-makes-me-nervous.html"/>
    <updated>2008-12-06T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/12/python-makes-me-nervous</id>
    <content type="html">
      &lt;img alt=&quot;wait-til-you-see-those-goddamn-bats.jpg&quot; src=&quot;/images/wait-til-you-see-those-goddamn-bats.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;238&quot; width=&quot;274&quot; /&gt; &lt;div&gt;The amount of time saved by using Python as opposed to something like Java is inversely proportional to the number of people working on the project.&lt;br /&gt;&lt;br /&gt;As a programmer in a team, you need rules.&amp;nbsp; You need structure.&amp;nbsp; You need order.&amp;nbsp; Freewheeling your way around a software project is going to create more problems than it solves.&lt;br /&gt;&lt;br /&gt;What I'm butthurting about here is Python's duck typing.&amp;nbsp; It's cute when you're a lone wolf working on a simple Django application, but add a few more people to the project and it quickly becomes unmanageable.&amp;nbsp; Why?&amp;nbsp; Because with duck typing, you need to keep &lt;b&gt;a lot&lt;/b&gt; more state in your head to interact with other people's code.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Pydev for Eclipse Sucks Too&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Method signatures are virtually useless in Python.&amp;nbsp; In Java, static typing makes the method signature into a recipe: it's all the shit you need to make this method work. Not so in Python.&amp;nbsp; Here, a method signature will only tell you one thing: how many arguments you need to make it work.&amp;nbsp; Sometimes, it won't even do that, if you start fucking around with **kwargs.&lt;br /&gt;&lt;br /&gt;Calling a colleague's method isn't as easy as looking at the signature.&amp;nbsp; You need to look at the method definition itself to see what it does with its input.&lt;br /&gt;&lt;br /&gt;Let's look at an example from Thrift, Facebook's open source RPC framework.&amp;nbsp; Here's the signature to a TServer constructor in Java:&lt;br /&gt;

      &lt;pre&gt;&lt;code&gt;
      protected TServer(TProcessorFactory processorFactory, TServerTransport serverTransport)
      &lt;/code&gt;&lt;/pre&gt;

      And there are a few other constructors that take different args.&amp;nbsp; Pretty straightforward: if you look at this, you know what you need to instantiate to get your TServer up and running.&amp;nbsp; Now let's look at the Python version:&lt;br /&gt;
      &lt;pre&gt;&lt;code&gt;
      def __init__(self, *args):
      &lt;/code&gt;&lt;/pre&gt;

      So, how do you use it?&amp;nbsp; Big fuckin' mystery!&amp;nbsp; You can't overload constructors in Python, so they had to mash the several different constructors into one.&amp;nbsp; To figure out how to instantiate a TServer, you need to look at the constructor implementation.&amp;nbsp; &lt;i&gt;As a user of the library, the implementation is none of my concern, unless I'm programming in Python.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;What a waste of time.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Whatever You Do, Don't Do It Wrong&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;What about errors?&amp;nbsp; Python exceptions are what really make me nervous.&amp;nbsp; Your code can run fine for the longest time then shit out with a runtime exception.&amp;nbsp; How do you know what exceptions a method can throw?&amp;nbsp; Well, you don't, unless you look at the method definition.&amp;nbsp; Fantastic.&lt;br /&gt;&lt;br /&gt;Java has a well thought out hierarchy of checked and runtime exceptions.&amp;nbsp; Sure, handling checked exceptions means you need to write a bit more code, but it's better to spend the time in development than in debugging at 4am.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Another example is in order.&amp;nbsp; In Java, the constructor to FileInputStream throws a FileNotFoundException if something goes wrong.&amp;nbsp; Since it's a checked exception, you need to deal with it somehow.&amp;nbsp; The fact that this exception is thrown is made obvious in the documentation, and your code won't compile if you ignore it.&lt;br /&gt;&lt;br /&gt;Python, on the other hand, prefers to leave things up to chance.&amp;nbsp; This is the documentation for the open() builtin, that opens a file:&lt;br /&gt;&lt;br /&gt;
      &lt;pre&gt;Help on built-in function open in module __builtin__:

      open(...)
      open(name[, mode[, buffering]]) -&amp;gt; file object

      Open a file using the file() type, returns a file object.
      (END)
      &lt;/pre&gt;

      How does this function handle a failure?&amp;nbsp; Does it raise an Exception?&amp;nbsp; Does it return a special value?&amp;nbsp; Nobody seems to know!&amp;nbsp; Ah, fuck it, that's a runtime problem, right?&lt;br /&gt;&lt;br /&gt;Sure, runtime exceptions happen in Java, but they are usually things that are indicative of a &lt;b&gt;big&lt;/b&gt; fuckup like a NullPointerException, not something stupid like a file not being found.&lt;br /&gt;&lt;br /&gt;Programming a large project in Python makes me uneasy.&amp;nbsp; Perhaps I'm just doing it wrong?&amp;nbsp; Do other Pythonistas drop a Valium before they begin the day?&lt;br /&gt;&lt;/div&gt;
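
      &lt;p&gt;The help text won't tell you, but you can poke at the behavior yourself.&amp;nbsp; Here's a minimal sketch, assuming Python 3, where a failed open() raises an OSError subclass such as FileNotFoundError (on the Python 2 of this era it would be an IOError).&amp;nbsp; The read_or_default helper and the file name are made up for illustration:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;
import os
import tempfile


def read_or_default(path, default=""):
    """Return the file's text, or `default` if the file cannot be opened."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        # open() signals failure by raising, not by returning a special value.
        return default


# A path inside a fresh temp directory, so it is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "no-such-file.txt")
print(repr(read_or_default(missing, default="fallback")))
      &lt;/code&gt;&lt;/pre&gt;

      &lt;p&gt;Nothing forces the caller to write that try block, which is the whole complaint: the only way to find out it's needed is to read the docs or watch it blow up.&lt;/p&gt;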
    </content>
  </entry>

  <entry>
    <title>Analog Debugging is Hard</title>
    <link href="http://teddziuba.com/2008/11/analog-debugging-is-hard.html"/>
    <updated>2008-11-24T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/11/analog-debugging-is-hard</id>
    <content type="html">
      I took a new job at a software company in Palo Alto, which means an 80 mile commute every day through Bay Area combat traffic.&amp;nbsp; The first two weeks wore hard on my motorcycle - a 14 year old Ninja 500.&amp;nbsp; Last week on my ride home, the left turn signal stopped working.&amp;nbsp; Fuck.&lt;br /&gt;&lt;br /&gt;If you thought debugging a software problem was hard, try debugging a hardware problem.&amp;nbsp; There are some salient facts about hardware problems that make them a real bitch:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;You need to buy a lot of tools.&lt;/li&gt;&lt;li&gt;There's a real possibility that you will fuck something up beyond repair.&lt;/li&gt;&lt;li&gt;There's a real possibility that you will injure yourself.&lt;/li&gt;&lt;li&gt;If it's your primary vehicle, you need to have it up and running on Monday.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;A blinker light stops working, which means electricity isn't flowing.&amp;nbsp; Sounds easy, but to get access to the wires, I needed to take the whole damn thing apart:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;close_up_small.JPG&quot; src=&quot;/images/close_up_small.JPG&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;410&quot; width=&quot;547&quot; /&gt;&lt;br /&gt; &lt;div&gt;You know how when you're writing software for a client, and they completely underestimate the amount of time and effort required to build something?&amp;nbsp; Yeah, the same goes for auto mechanics.&amp;nbsp; Don't bitch about a shop's $75/hr labor rate or their diagnosis fee.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Avoiding NLP At All Costs</title>
    <link href="http://teddziuba.com/2008/11/avoiding-nlp-at-all-costs.html"/>
    <updated>2008-11-13T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/11/avoiding-nlp-at-all-costs</id>
    <content type="html">
      &lt;img alt=&quot;hurrdurr.gif&quot; src=&quot;/images/hurrdurr.gif&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;286&quot; width=&quot;317&quot; /&gt;I'm working with a startup now on a text summarization project.&amp;nbsp; The requirements are fairly loose: &quot;take all this text and make it smaller&quot;, solving the tl;dr problem (too long; didn't read).&amp;nbsp; There are a couple of critical details, namely identifying the sentiment of the text, and a few others that are excruciatingly domain-specific.&lt;br /&gt;&lt;br /&gt;At first glance, this seems approachable with some natural language processing libraries.&amp;nbsp; Oh no.&amp;nbsp; There be dragons.&amp;nbsp; At Pressflip, I got myself into a few NLP libraries, and the only takeaway I got from all that experience was &lt;i&gt;&quot;Don't use NLP.&amp;nbsp; Ever.&quot;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Why?&amp;nbsp; NLP is yucky.&amp;nbsp; It's complicated, the field is rife with academic shitheaddery, there are some major-asspain licensing issues with a couple of software packages and best of all, it's balls slow.&amp;nbsp; Plus, if you venture down the road of natural language processing, the law of diminishing returns will pull you into a dark alley, pummel you with a tire iron, take your wallet, and then just to be a prick, steal your shoes so you need to walk home barefoot.&lt;br /&gt;&lt;br /&gt;My point is, for 99 practical projects out of 100, you can cheat your way out of NLP.&amp;nbsp; Cook up some fancy shit with word frequencies and logarithms.&amp;nbsp; Reach back into your information retrieval notes for inspiration.&amp;nbsp; TF*IDF can take you a long way if you know how to use it.&lt;br /&gt;&lt;br /&gt;When I was brainstorming the project I'm working on, my first thought was some hand-waving business about a part-of-speech tagger and a Markov Chain to figure out probabilities of part-of-speech transitions and all 
that fancy shit.&amp;nbsp; Factor in a little bit of sentiment detection from God-knows-where and that was my sketch.&amp;nbsp; Then practicality set in: how much time do you want to spend on this?&amp;nbsp;&lt;b&gt; If you are considering NLP as the answer to a real problem, it's virtually certain that you're overthinking it&lt;/b&gt;. &lt;br /&gt;&lt;br /&gt;That being said, NLP does have its place: &lt;a href=&quot;http://www.powerset.com/&quot;&gt;making the best fucking Wikipedia search engine there ever was with technology licensed from Xerox and then selling yourself to Microsoft&lt;/a&gt;.&lt;br /&gt;
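
      &lt;p&gt;To make the &quot;word frequencies and logarithms&quot; cheat concrete, here's a rough sketch of TF*IDF sentence scoring for summarization.&amp;nbsp; It is not the code from the project: it assumes Python 3, treats each sentence as its own document, uses plain IDF = log(N / df), and the tokenize and summarize names and the sample reviews are made up for illustration:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def summarize(sentences, keep=1):
    """Return the `keep` highest-scoring sentences, in their original order."""
    token_lists = [tokenize(s) for s in sentences]
    total = len(sentences)

    # Document frequency: in how many sentences does each word appear?
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    def score(tokens):
        # Sum of TF * IDF; a word found in every sentence contributes zero.
        counts = Counter(tokens)
        return sum(tf * math.log(total / doc_freq[word])
                   for word, tf in counts.items())

    ranked = sorted(range(total), key=lambda i: score(token_lists[i]),
                    reverse=True)
    chosen = sorted(ranked[:keep])
    return [sentences[i] for i in chosen]


reviews = [
    "The food was good.",
    "The service was good.",
    "The waiter spilled scalding coffee on my laptop.",
]
print(summarize(reviews, keep=1))
      &lt;/code&gt;&lt;/pre&gt;

      &lt;p&gt;Raw sums favor long sentences, so dividing each score by the sentence length is the obvious next knob to turn.&amp;nbsp; Still no part-of-speech tagger in sight.&lt;/p&gt;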
    </content>
  </entry>

  <entry>
    <title>I Have Never Seen Ubuntu Get Upgrades Right</title>
    <link href="http://teddziuba.com/2008/11/i-have-never-seen-ubuntu-get-u.html"/>
    <updated>2008-11-01T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/11/i-have-never-seen-ubuntu-get-u</id>
    <content type="html">
      Ubuntu upgrades always, always, &lt;i&gt;always&lt;/i&gt; fuck up the same things: network connections and graphics drivers.&amp;nbsp; Without fail, if you upgrade, your wireless connection won't work and any closed-source video card drivers you need will get ill.&amp;nbsp; 8.10 Intrepid Ibex is no exception.&lt;br /&gt;&lt;br /&gt;I figured out how to get my wireless connection back, but the NVidia drivers are still a mystery.&amp;nbsp; I don't care about free software idealism, I care that my shit &lt;i&gt;works&lt;/i&gt;.&amp;nbsp; I'm willing to jump through minor hoops to make it work, like Ubuntu's &quot;restricted drivers&quot; lecture, but now that doesn't even work:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;intrepid-fucked.png&quot; src=&quot;/images/intrepid-fucked.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;573&quot; width=&quot;505&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Why is this such a pain in the balls?&amp;nbsp; I press the activate button and nothing happens. AWESOME. Just because I'm used to wasting hours fixing things like this doesn't mean I enjoy it.&lt;br /&gt;&lt;br /&gt;NVidia has 70-ish percent market share of all GPUs.&amp;nbsp; If your shit doesn't work out of the box on 70% of the graphics cards out there, who has failed?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;[Update, the next day]:&lt;/b&gt;&amp;nbsp; I found the fix.&amp;nbsp; Navigate to System &amp;gt; Administration &amp;gt; Synaptic Package Manager.&amp;nbsp; From there, go to Settings &amp;gt; Repositories.&amp;nbsp; In the &quot;Ubuntu Software&quot; tab, check the &quot;Proprietary device drivers&quot; box.&amp;nbsp; Or edit /etc/apt/sources.list if you want to show your chest hair.&lt;br /&gt;&lt;br /&gt;I'm glad to see that passive-aggressive Debian superiority is alive and well.&amp;nbsp; &lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Auto Mechanics: A Good Hobby for Programmers</title>
    <link href="http://teddziuba.com/2008/10/auto-mechanics-a-good-hobby-fo.html"/>
    <updated>2008-10-19T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/10/auto-mechanics-a-good-hobby-fo</id>
    <content type="html">
      &lt;img alt=&quot;i-disagree.jpg&quot; src=&quot;/images/i-disagree.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;230&quot; width=&quot;307&quot; /&gt; &lt;div&gt;When I was a kid, I used to help my Dad work on the family cars.&amp;nbsp; We changed the oil, brake pads, repaired a broken hydraulic line, and fixed faulty air conditioning.&lt;br /&gt;&lt;br /&gt;It wasn't too long before I was able to do most basic repair and maintenance myself.&amp;nbsp; In college, I spent hours fixing an electrical problem that caused my rear turn signals to go out.&lt;br /&gt;&lt;br /&gt;Recently, my car started to make a weird &quot;coughing&quot; noise from the muffler, and I fixed that, too.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;This Is Going Somewhere I Swear&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Why is this relevant to programming?&amp;nbsp; Well, one of my favorite parts about programming production services is forensic debugging.&amp;nbsp; Some process crashes in the middle of the night and all you're left with is a beeper going apeshit and a stack trace.&amp;nbsp; What went wrong?&amp;nbsp; How do you debug that?&lt;br /&gt;&lt;br /&gt;Debugging a car is the same thing, but a lot harder.&amp;nbsp; With a car, the symptoms of the bug aren't usually very concrete: a funny noise, a bad smell, a jittery feeling.&amp;nbsp; Compared to a computer, a car is a very simple machine, but because it's so simple it's much harder to debug.&amp;nbsp; Newer cars have an electronic interface to tell you what sensors are indicating faults, but that doesn't always solve the problem.&amp;nbsp; Plus, the sensor readers are like $300.&lt;br /&gt;&lt;br /&gt;The more you work on a car, the more you develop an intuition about it.&amp;nbsp; In code, you can narrow your bug down and fix it.&amp;nbsp; With a car, you narrow the fault down to a couple of suspect parts and start by replacing the cheapest one.&amp;nbsp; For me, fixing a 
car problem is much more gratifying than fixing a code problem because of the tangibility of it.&lt;br /&gt;&lt;br /&gt;So, if you enjoy debugging and problem solving, you'd probably like auto mechanics.&amp;nbsp; There are a couple of collateral upshots to it: you save some money, you can give your friends car advice, and you get to buy a bunch of really awesome tools.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;No one can pull your man card when you have a specialized wrench for an O2 sensor.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Most Programming Interviews are a Waste of Time</title>
    <link href="http://teddziuba.com/2008/10/most-programming-interviews-ar.html"/>
    <updated>2008-10-07T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/10/most-programming-interviews-ar</id>
    <content type="html">
      &lt;img alt=&quot;dr-seuss-wtf-is-this-shit.jpg&quot; src=&quot;/images/dr-seuss-wtf-is-this-shit.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;266&quot; width=&quot;191&quot; /&gt; &lt;div&gt;Interviewing a candidate is so much fun because you get to passively assert your superiority &lt;i&gt;and&lt;/i&gt; be professorial enough that you can justify those nine years you spent in graduate school studying compiler optimizations only to get a job maintaining a failure-prone database driven web app.&lt;br /&gt;&lt;br /&gt;Interviewers spend almost as much time Googling for interview questions as candidates do.&lt;br /&gt;&lt;br /&gt;I've been on both sides of the interview, and I'm here to dump a big load of truth on you about what interviewers really think.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Technical Question&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; How would you find a cycle in a singly linked list?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; This job has nothing to do with linked lists.&amp;nbsp; In fact, I don't think anyone has used a singly linked list since the seventies.&amp;nbsp; I wonder if you're good at PHP and MySQL, because that's what all the work is here, but I'm not going to ask you anything about actual job requirements, because that doesn't afford me the opportunity to be pathologically pedantic.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;The Follow-Up&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; And how would you refine your solution to use O(n) time and O(1) space?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; I haven't actually solved this problem myself, nor have I formally proven the &quot;right&quot; answer to be correct. 
I have done no preparation for this interview beyond looking on the internet for programming interview questions.&amp;nbsp; I'm basically dead wood in this organization, and I pray every day that nobody figures this out.&amp;nbsp; As such, if you come up with the answer quickly, I'll either think you cheated and looked up programming interview questions on the internet, &lt;i&gt;or&lt;/i&gt; you're genuinely smart enough to expose my own uselessness should you get hired.&amp;nbsp; In either case, your best course of action here is to pretend like you don't know and let me explain the correct answer with a shit-eating grin on my face.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;The Bullshit&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; Where do you see yourself in five years?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; I have no idea what I'm doing.&amp;nbsp; I really need to keep this room from going dead-air, so I'll give you something to talk about.&amp;nbsp; Just start talking.&amp;nbsp; I really don't care, say anything.&amp;nbsp; I'm not listening.&amp;nbsp; I'm using this moment to think about the woman working in HR that I want to bone, but wants nothing to do with me because I'm an introverted nerd who will never work up the sack to ask her out.&amp;nbsp; Fuck my life.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Finally It's Over&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; Do you have any questions for me?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; We're 35 minutes through a 45 minute interview.&amp;nbsp; If this doesn't take up ten minutes, I can blame ending the interview early on my clock being fast.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;So What Is A Good Interview?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For once, I'd like to have a 45 minute candid conversation with an interviewee.&amp;nbsp; Talk about interests, shoot the breeze, and get a general idea of the guy's 
technical aptitude and fit with the company.&amp;nbsp; Talk about projects, see if the guy gets animated.&amp;nbsp; Combine that with some pre-submitted code samples, and you can get a genuine idea of how suited the candidate is.&lt;br /&gt;&lt;br /&gt;I don't know about you, but I would not want to work for or with somebody who is this passive-aggressive.&lt;br /&gt;&lt;br /&gt;Asking pedantic and useless questions like this is just a waste of everyone's time.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Java subList Gotcha</title>
    <link href="http://teddziuba.com/2008/09/java-sublist-gotcha.html"/>
    <updated>2008-09-13T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/09/java-sublist-gotcha</id>
    <content type="html">
      &lt;img alt=&quot;yogi.jpg&quot; src=&quot;/images/yogi.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt; &lt;div&gt;Don't ever try to do anything sneaky when you're programming.&amp;nbsp; It will always bite you in the ass.&amp;nbsp; If you still want to be sneaky, read the documentation.&lt;br /&gt;&lt;br /&gt;Last week, we had a problem with one of our processes hanging and burning 100% CPU.&amp;nbsp; The first time it happened we chalked it up to mysteries of the universe and restarted the process (a time-honored startup tradition), but the second time, I actually got off my ass and investigated.&lt;br /&gt;&lt;br /&gt;Through the miracle of &lt;code&gt;jstack&lt;/code&gt;, I could look at the stack trace of a currently running Java process.&amp;nbsp; This is what I found:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;

      &lt;pre&gt;&quot;main&quot; prio=10 tid=0x0a030800 nid=0x10a6 runnable [0xb7d7d000..0xb7d82218]
      java.lang.Thread.State: RUNNABLE
      at java.util.SubList$1.nextIndex(AbstractList.java:713)
      at java.util.SubList$1.nextIndex(AbstractList.java:713)

      ...snip... about 100 lines

      at java.util.SubList$1.hasNext(AbstractList.java:691)
      at java.util.SubList$1.next(AbstractList.java:695)
      at java.util.SubList$1.next(AbstractList.java:696)

      ...snip... about 100 lines

      at com.pressflip.pipeline.standard.deduper.ShingleDupeDetector.&lt;br /&gt;        dedupBatch(ShingleDupeDetector.java:139)
      at com.pressflip.pipeline.standard.deduper.DeduperPipelineStep.&lt;br /&gt;        innerProcess(DeduperPipelineStep.java:115)

      ... and right down to the main() from here.&lt;br /&gt;&lt;/pre&gt;

      &lt;p&gt;The suspect line in all this is ShingleDupeDetector.java:139, which is one of those &lt;i&gt;how-the-hell-are-you-hanging-on-this&lt;/i&gt; lines:&lt;/p&gt;

      &lt;pre&gt;for (Integer x : someCollectionOfIntegers) {
      &lt;/pre&gt;

      &lt;p&gt;So what the shit, right?&lt;/p&gt;&lt;p&gt;I was using this collection as a cache of sorts, where on every run, I chopped some data off the front of it and added some data to the back, keeping the collection size constant.&amp;nbsp; To accomplish this, I used the &lt;code&gt;subList&lt;/code&gt; method on &lt;code&gt;java.util.List&lt;/code&gt;, something like this:&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
      &lt;pre&gt;someCollectionOfIntegers = someCollectionOfIntegers.subList(fromIndex,
      someCollectionOfIntegers.size());
      someCollectionOfIntegers.addAll(incoming);
      &lt;/pre&gt;

      &lt;p&gt;Well it turns out that &lt;code&gt;subList&lt;/code&gt; didn't do what I thought it did.&amp;nbsp; I assumed that I just got a new &lt;code&gt;List&lt;/code&gt; that contained the elements in the given range of the original.&amp;nbsp; Oh no, &lt;code&gt;subList&lt;/code&gt; returns a &lt;i&gt;view&lt;/i&gt; of the original list where only elements in the given range are addressable.&amp;nbsp; A look at &lt;code&gt;AbstractList.java&lt;/code&gt;'s source reveals this:&lt;/p&gt;

      &lt;pre&gt;
      public List&amp;lt;E&amp;gt; subList(int fromIndex, int toIndex) {
          return new SubList&amp;lt;E&amp;gt;(this, fromIndex, toIndex);
      }
      &lt;/pre&gt;

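      &lt;p&gt;And the &lt;code&gt;SubList&lt;/code&gt; object keeps a reference to &lt;code&gt;this&lt;/code&gt;, as well as an offset to know where iteration starts.&amp;nbsp; Every time I updated the &quot;cache&quot;, I wrapped the previous view in yet another &lt;code&gt;SubList&lt;/code&gt;, so iterating over it recursed through every layer, which is exactly the pile of &lt;code&gt;SubList$1&lt;/code&gt; frames in the stack trace.&amp;nbsp; Oh, balls.&amp;nbsp; That's why it was running so slow.&lt;br /&gt;&lt;/p&gt;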
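      &lt;p&gt;A minimal sketch of one way around this, under a few assumptions: the class and method names are made up for illustration (this is not the Pressflip code), the cache is a plain &lt;code&gt;ArrayList&lt;/code&gt;, and it needs Java 10 or later for &lt;code&gt;var&lt;/code&gt;.&amp;nbsp; Instead of reassigning the variable to a view, it uses a throwaway view to clear the unwanted prefix and then appends to the same list, so views never stack up.&amp;nbsp; Copying the view into a fresh &lt;code&gt;ArrayList&lt;/code&gt; before reassigning should work just as well.&lt;/p&gt;

```java
import java.util.ArrayList;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RollingCacheSketch {
    // Runs the given number of cache updates and returns the final contents as text.
    // Each update drops two elements from the front and appends two at the back.
    static String run(int rounds) {
        // Hypothetical stand-in for someCollectionOfIntegers: a mutable ArrayList of 1..4.
        var cache = Stream.of(1, 2, 3, 4)
                .collect(Collectors.toCollection(ArrayList::new));
        int next = 5;
        for (int remaining = rounds; remaining > 0; remaining--) {
            // The view is used once and thrown away; clear() removes the
            // range from the backing list itself, so no view is kept around.
            cache.subList(0, 2).clear();
            cache.add(next++);
            cache.add(next++);
        }
        return cache.toString();
    }
}
```

      &lt;p&gt;Calling &lt;code&gt;RollingCacheSketch.run(3)&lt;/code&gt; should come back with &lt;code&gt;[7, 8, 9, 10]&lt;/code&gt;, and iterating the list costs the same no matter how many rounds have run.&lt;/p&gt;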
    </content>
  </entry>

  <entry>
    <title>A Web OS?  Are You Dense?</title>
    <link href="http://teddziuba.com/2008/09/a-web-os-are-you-dense.html"/>
    <updated>2008-09-06T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/09/a-web-os-are-you-dense</id>
    <content type="html">
      People are calling Google Chrome a &quot;Web Operating System&quot; and a &quot;Cloud Operating System&quot;.&amp;nbsp; Some are even calling it a Windows killer.&lt;br /&gt;&lt;br /&gt;I think it's time to nip this horseshit in the bud, before it gets out of hand.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;How Does Arringtons Know What Operating Systems Is?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;He doesn't.&amp;nbsp; It is TechCrunch's official position that Google Chrome will compete full on with Microsoft Windows, and computers will be sold with Chrome only, having the Windows layer &quot;stripped out&quot;.&amp;nbsp; I am not shitting you, &lt;a href=&quot;http://www.techcrunch.com/2008/09/01/meet-chrome-googles-windows-killer/&quot;&gt;he actually said that&lt;/a&gt;.&amp;nbsp; Yeah, I get where the argument is going about web apps being more dominant than desktop apps.&amp;nbsp; That prediction is a crock of shit.&amp;nbsp; A &lt;a href=&quot;http://www.techcrunch.com/2007/12/18/majority-of-americans-on-google-docs-what-you-talkin-bout-willis/&quot;&gt;2007 survey&lt;/a&gt; found that 73% of Americans have never even &lt;i&gt;heard&lt;/i&gt; of Google Docs, and 94% have never tried an online office suite.&amp;nbsp; Yeah, desktop apps aren't going anywhere.&lt;br /&gt;&lt;br /&gt;But I'm not here to talk shit on Web 2.0 today.&amp;nbsp; I'm going to present a glimpse of the hole that the incompetent programmers are digging for us.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;When Times Were Simple&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Let's have a look at the application stack that we all know and love: programs compiled to run in an environment with a C library.&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;normal-cropped.gif&quot; src=&quot;/images/normal-cropped.gif&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;245&quot; width=&quot;520&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Fuck me, life is good.&lt;br 
/&gt;&lt;br /&gt;&lt;b&gt;Making It Easier On Programmers&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I first learned to program in C++ and then later on I learned Java in college.&amp;nbsp; I thought the whole Java Runtime Environment thing was kind of weak, but if it means I don't have to manage memory, that's cool.&amp;nbsp; Same goes for Python, Ruby, and whatever else has its own VM or interpreter.&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;runtime-cropped.gif&quot; src=&quot;/images/runtime-cropped.gif&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;290&quot; width=&quot;527&quot; /&gt;&lt;br /&gt;&lt;br /&gt;This situation is pretty agreeable, and lets us prototype applications rapidly.&amp;nbsp; Sure, there's a small trade-off with execution speed, but they have multi-gigahertz processors nowadays.&amp;nbsp; No big deal.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Making It Easier On Idiots&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After a while, everybody wanted to be a programmer.&amp;nbsp; Since programming is actually kind of hard, many of these folk landed in PHP and HTML, hence the explosion of webapps.&amp;nbsp; As such, the browser became a feeble example of a &quot;runtime&quot;.&lt;br /&gt;&lt;br /&gt;Now, with Google Chrome being lauded as a Web Operating System, the stack gets way bigger.&amp;nbsp; This is what it looks like on my computer, considering I run Linux and Google hasn't released their Operating System for the Linux Operating System (that makes sense, doesn't it?)&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;chrome-os2.gif&quot; src=&quot;/images/chrome-os2.gif&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;480&quot; width=&quot;640&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Users have pretty basic needs when it comes to computers.&amp;nbsp; They want word processing, spreadsheets, communications, and games.&amp;nbsp; 
These needs have not changed much since the advent of the personal computer.&amp;nbsp; So, when your aunt asks why her 1.2GHz computer isn't fast enough to run an online word processor that has the same fucking features as the 1987 version of Corel WordPerfect, you don't have an answer for her.&amp;nbsp; There is no justification.&lt;br /&gt;&lt;br /&gt;The &quot;Web Operating System&quot; just highlights how much journalists don't know about computers.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Porter Stemming Makes Me Rage</title>
    <link href="http://teddziuba.com/2008/07/porter-stemming-makes-me-rage.html"/>
    <updated>2008-07-23T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/07/porter-stemming-makes-me-rage</id>
    <content type="html">
      &lt;img alt=&quot;not_illegal_in_thailand.jpg&quot; src=&quot;/images/not_illegal_in_thailand.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;267&quot; width=&quot;334&quot; /&gt; &lt;div&gt;I have no formal training in natural language processing.&amp;nbsp; As such, I figure out a lot of this shit on my own.&lt;br /&gt;&lt;br /&gt;One of the simplest concepts in NLP/text mining is stemming.&amp;nbsp; If you're not in the know, to stem a word is to remove all the unnecessary shit after its root.&lt;br /&gt;&lt;br /&gt;For example, &quot;computer&quot;, &quot;computing&quot; and &quot;compute&quot; all stem to &quot;comput&quot;.&amp;nbsp; Same root, virtually the same meaning.&lt;br /&gt;&lt;br /&gt;Something like this is clearly useful in a search engine like Pressflip, because if somebody searches for &quot;iphone&quot; (and a &lt;i&gt;lot&lt;/i&gt; of you people are), the engine should pull up documents that contain the plural (iphones) of the word.&lt;br /&gt;&lt;br /&gt;The canonical algorithm for doing this sort of thing is called the Porter Stemming Algorithm, which considers each word on its own.&amp;nbsp; Porter works great 99% of the time, but when it fails, it fucks you &lt;i&gt;hard&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Why You Keep Tryin To Say That Word?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A good example of this comes from the pressflip query logs.&amp;nbsp; A user searched for &quot;marketing&quot;.&amp;nbsp; Perfectly reasonable.&amp;nbsp; Porter stemmed that to &quot;market&quot;, which returned a bunch of search results about the Dow Jones and Nasdaq.&amp;nbsp; Ouch. 
Right in the butt.&lt;br /&gt;&lt;br /&gt;What went wrong?&amp;nbsp; In smart-talk, the bare infinitive that corresponds to the gerund has a different meaning than the gerund.&amp;nbsp; Again, I know dick-shit about NLP, so maybe you guys have a serious-business name for this sort of thing.&lt;br /&gt;&lt;br /&gt;So yeah, gerunds make Porter suck sometimes.&lt;br /&gt;&lt;br /&gt;There are some other failure cases I've discovered.&amp;nbsp; Proper nouns will give it to you Clydesdale-style, too.&amp;nbsp; More specifically, proper nouns that don't stem to themselves.&amp;nbsp; Example: &quot;Mariners&quot; and &quot;Marin&quot; both share the same stem.&amp;nbsp; So potentially, someone searching for the baseball team from Seattle will come up with news about the hoity-toity town across the Golden Gate Bridge from San Francisco.&lt;br /&gt;&lt;br /&gt;What's the answer to this?&amp;nbsp; If you're a company with millions in VC lottery winnings, you can pay Basistech $100,000 for a 3-year license of their context sensitive stemmer.&amp;nbsp; If you're me, though, you make exclusion lists.&amp;nbsp; Big ones.&lt;br /&gt;&lt;br /&gt;That being said, after a large re-processing this weekend, Pressflip search quality is going to improve.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Build Google Protocol Buffers Without Maven</title>
    <link href="http://teddziuba.com/2008/07/build-google-protocol-buffers.html"/>
    <updated>2008-07-07T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/07/build-google-protocol-buffers</id>
    <content type="html">
      &lt;img alt=&quot;trippin_balls.jpg&quot; src=&quot;/images/trippin_balls.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;208&quot; width=&quot;225&quot; /&gt;Google released &lt;a href=&quot;http://code.google.com/p/protobuf/&quot;&gt;protocol buffers&lt;/a&gt; as open source, which, with a proper transport, will give both XML-RPC and &lt;a href=&quot;http://developers.facebook.com/thrift/&quot;&gt;Thrift&lt;/a&gt; a run for their money.&lt;br /&gt;&lt;br /&gt;Anyway, it's kind of a pain in the balls to build the Java version.&amp;nbsp; If you import the Java source into Eclipse, it's got all sorts of build errors, all stemming from a missing file: &lt;code&gt;DescriptorProtos.java&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;If you've got Maven installed, it will make &lt;code&gt;DescriptorProtos.java&lt;/code&gt; for you (this file is generated via &lt;code&gt;protoc&lt;/code&gt;).&amp;nbsp; But Maven is stupid, because it didn't work immediately after &lt;code&gt;apt-get install&lt;/code&gt; and I couldn't figure out how to fix it within 30 seconds.&amp;nbsp; I have no patience for this kind of bullshit.&lt;br /&gt;&lt;br /&gt;So, to build &lt;code&gt;DescriptorProtos.java&lt;/code&gt; without Maven, you make it by hand:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;protoc --java_out=/home/ted/some_directory \&lt;br /&gt;/path/to/protobufsrc/src/google/protobuf/descriptor.proto&lt;/pre&gt;(You already compiled &lt;code&gt;protoc&lt;/code&gt;, didn't you?)&lt;br /&gt;&lt;br /&gt;Drop the output file into Eclipse and protocol buffers will build.&amp;nbsp; There are still a bunch of compilation warnings, but only chumps listen to those.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Corporate Competence</title>
    <link href="http://teddziuba.com/2008/07/corporate-competence.html"/>
    <updated>2008-07-04T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/07/corporate-competence</id>
    <content type="html">
      &lt;img alt=&quot;1213940897512.jpg&quot; src=&quot;/images/1213940897512.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;360&quot; width=&quot;328&quot; /&gt; &lt;div&gt;I really love it when people &lt;i&gt;just do their jobs&lt;/i&gt;.&amp;nbsp; I feel gifted whenever I call a company and get a customer support representative who knows what they are doing and actually cares about me.&lt;br /&gt;&lt;br /&gt;It's rare, but it happens.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Worst ISP Ever&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For a while, I had Comcast's cable internet service.&amp;nbsp; It was clear after two years of putting up with their horseshit that they don't care about customers at all.&lt;br /&gt;&lt;br /&gt;Oh, wait, they set up a &lt;a href=&quot;http://twitter.com/comcastcares&quot;&gt;Twitter account&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Fantastic, but my BitTorrent shit still didn't work on their network.&amp;nbsp; Their installation staff is rude and has questionable hygiene, and their customer support representatives are downright lazy.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Switch to AT&amp;amp;T Now&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;When I moved, my first order of business was to call Comcast and tell them it's over.&amp;nbsp; They said my service wouldn't end until I brought back my cable modem, and of course, the place I need to bring it back to is only open during working hours.&lt;br /&gt;&lt;br /&gt;I took off work early to get this little brick of dissatisfaction back to its rightful owner, because fuck them.&lt;br /&gt;&lt;br /&gt;At the same time, I was waiting for AT&amp;amp;T to show up and install U-Verse internet service.&amp;nbsp; They did, and shit was &lt;i&gt;impressive&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;They told me the tech would be at my house any time from noon to 2pm on a Sunday.&amp;nbsp; The tech showed up at noon on the dot.&lt;/li&gt;&lt;li&gt;It took him 
about an hour to set up the service.&amp;nbsp; When he left, he gave me a card with his direct cell phone number.&amp;nbsp; If I had any problem in the next ten days, I could call him directly and he would come fix it.&lt;/li&gt;&lt;li&gt;An hour after he left, the service went out.&amp;nbsp; I called him, and he was back at my house within 30 minutes.&amp;nbsp; It turns out there was something wrong with the line from the street to my house, and he had to get &lt;i&gt;another&lt;/i&gt; tech out to fix it.&amp;nbsp; That guy showed up, fixed the problem, and was on his way.&amp;nbsp; The two of them stayed at my place until 8pm on a Sunday to get the job done right.&lt;/li&gt;&lt;/ul&gt;I've been using the service for almost a week now and it's great.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;No BitTorrent fuckery.&amp;nbsp; All my torrents work great, and I can seed.&lt;/li&gt;&lt;li&gt;10 megabits downstream, 1.5 megabits upstream.&lt;/li&gt;&lt;/ul&gt;Great job, AT&amp;amp;T, you actually care about the people paying your salaries.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Practical Unique Identifiers</title>
    <link href="http://teddziuba.com/2008/07/practical-unique-identifiers.html"/>
    <updated>2008-07-01T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/07/practical-unique-identifiers</id>
    <content type="html">
      &lt;img alt=&quot;dogs_love_md5.jpg&quot; src=&quot;/images/dogs_love_md5.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;230&quot; width=&quot;307&quot; /&gt;There have been a handful of places within the Persai pipeline where I have needed unique identifiers of varying length.&amp;nbsp; 64 bits here, 32 bits there.&amp;nbsp; I'm not the only one to ever have to solve this problem, but I could never find a concise toolbox of information on it.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Automatic Increment or Not&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;MySQL has the AUTO_INCREMENT modifier for integral record keys.&amp;nbsp; That's great, if you're using MySQL.&amp;nbsp; In general, prefer a non-automatically increasing record identifier, unless you have a specific reason.&amp;nbsp; Here's why:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;You may actually have to think about thread synchronization at some point when creating records.&lt;/li&gt;&lt;li&gt;If these identifiers become publicly visible, they can leak information about how many records are in your database.&lt;/li&gt;&lt;li&gt;If you make identifiers out of other pieces of data (say URLs), then you can't get the identifier value of a given datum without a table lookup.&amp;nbsp; And even then, you'll need another index on &lt;i&gt;that&lt;/i&gt; field.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;There are a few cases where automatic increment identifiers are good, though:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;You are using a MySQL database and are setting up a simple structure of tables. (i.e. 
MySQL handles synchronization for you and it's actually harder to &lt;i&gt;not&lt;/i&gt; use automatic increment)&lt;/li&gt;&lt;li&gt;The creation order of records is really important to you, but not important enough to store a timestamp field.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;Making an Identifier Out Of Arbitrary Data&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Easy, right?&amp;nbsp; Just hash whatever data you've got.&amp;nbsp; It's not reversible and spread uniformly over the identifier space.&amp;nbsp; However, many times the output of a standard hashing algorithm is too big.&amp;nbsp; SHA-1, for example, is 160 bits wide.&amp;nbsp; Way too long for most purposes.&lt;br /&gt;&lt;br /&gt;In this case, I truncate the output.&amp;nbsp; Yes, this is mathematically valid, because any good hashing algorithm's output will be uniform over the range of the function.&amp;nbsp; And by uniform, I mean really uniform.&amp;nbsp; For example, if you take the first 64 bits of a 160-bit SHA-1 hash and call that your unique identifier, the probability of a collision is going to be uniform over the space of all 64-bit numbers.&amp;nbsp; If it wasn't (i.e. the first 64-bits of a SHA-1 hash were distributed, say normally), then the hash function would be cryptographically insecure.&lt;br /&gt;&lt;br /&gt;Don't try to swing your dick around and come up with your own hash function.&amp;nbsp; You'll screw it up.&amp;nbsp; I know I have.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;
      GUIDs&lt;/b&gt;&lt;br /&gt;
      &lt;br /&gt;
      I got an e-mail from a reader about using GUIDs for unique identifiers.&amp;nbsp; This fits with the hashing scheme, but for the most part, I think GUIDs are far too large, especially if you are storing a lot of records.&amp;nbsp; GUIDs are 128 bits wide, so if you have a hundred million records, that's about 1.5GB worth of identifiers.&amp;nbsp; Use a 64-bit identifier, and your space is halved, without a significant increase in collision probability.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Making An Identifier Easier On The Eyes&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you need to put a unique identifier in a URL, it can't look too nerdy.&amp;nbsp; For example, this URL looks like shit:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;code&gt;http://www.website.com/document?id=1b25a53bf21d0206&lt;/code&gt;&lt;br /&gt;&lt;/blockquote&gt;Too many numbers.&amp;nbsp; So, to make it look better, Base-64 encode it.&amp;nbsp; It will lengthen the code a little, but it's much easier to look at:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;code&gt;http://www.website.com/document?id=ZnJvc3RlZCBidXR0cw==&lt;/code&gt;&lt;br /&gt;&lt;/blockquote&gt;Eh, well it looks better to me.&amp;nbsp; Personal taste, I guess.&lt;br /&gt;&lt;br /&gt;You'll need to make sure that your Base-64 alphabet doesn't include the + and / characters: they aren't URL safe.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Sort Orderings&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Don't worry about sort ordering unless you have to worry about sort ordering.&amp;nbsp; Duh.&amp;nbsp; The vast majority of Persai's data is stored simply as files, and for most purposes we don't have to care about the processing order.&amp;nbsp; We're fortunate in that regard (well maybe not &lt;i&gt;fortunate&lt;/i&gt;, I mean that's like saying you're &lt;i&gt;fortunate&lt;/i&gt; that you're not fat because you exercise and eat sensibly).&lt;br /&gt;&lt;br /&gt;Anyway, there are a couple of places in Persai where sort order matters.&amp;nbsp; The ordering of 
recommendations, for example.&amp;nbsp; There, though, we're just ordering by time, and we need to display the exact time, not just the relative times of the recommendations, so we store a date field and order data by it in the store.&lt;br /&gt;&lt;br /&gt;This drives one of my earlier points home: &lt;i&gt;if you need ordering by time, don't count on an automatic increment unique identifier to do it&lt;/i&gt;.&amp;nbsp; It's much more robust to store a timestamp.&lt;br /&gt;&lt;br /&gt;In fact this point goes deeper.&amp;nbsp; Very rarely do you actually need records sorted by record identifier.&amp;nbsp; What you need is the records sorted by some other value that happens to be reflected in the record identifier by virtue of automatic increment and the insertion order.&amp;nbsp; It's always more robust to store the actual value you need to sort by.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I'm Not Going To Tell You How To Write Code&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Because I don't really care.&amp;nbsp; This is how I do it, though.&lt;br /&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt;
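      &lt;br /&gt;The truncation and URL-safe encoding described above can be sketched in Java.&amp;nbsp; This is an illustration under assumptions, not the Persai code: the class and method names are hypothetical, it takes the first 64 bits of a SHA-1 digest as the identifier, and it leans on the JDK's URL-safe Base-64 encoder (Java 8 or later), which swaps + and / for - and _.&lt;br /&gt;&lt;br /&gt;

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class TruncatedHashId {
    // Hashes the input with SHA-1 and keeps only the first 64 bits of the digest.
    static long id64(String data) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(data.getBytes(StandardCharsets.UTF_8));
            // getLong() reads the first 8 bytes of the 20-byte digest, big-endian.
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to ship SHA-1, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Encodes a 64-bit identifier with the URL-safe Base-64 alphabet, no padding.
    static String urlSafe(long id) {
        byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(id).array();
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }
}
```

      An 8-byte identifier comes out as 11 characters once the padding is dropped, for example via &lt;code&gt;TruncatedHashId.urlSafe(TruncatedHashId.id64(url))&lt;/code&gt;.&lt;br /&gt;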
    </content>
  </entry>

  <entry>
    <title>An Engineer's Guide To Weight Loss</title>
    <link href="http://teddziuba.com/2008/05/an-engineers-guide-to-weight-l.html"/>
    <updated>2008-05-20T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/05/an-engineers-guide-to-weight-l</id>
    <content type="html">
      &lt;img alt=&quot;1208418061380.jpg&quot; src=&quot;/images/1208418061380.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;249&quot; width=&quot;200&quot; /&gt;After a year of the Google cafeteria and another year of eating low-cost Startup FounderChow, I put on a few pounds.&amp;nbsp; Now, I'm starting to shed them, and I'll tell you how.&lt;br /&gt;&lt;br /&gt;Before I get into it, I want to lay down a few prerequisites.&amp;nbsp; There are a lot of diet guides out there that will bullshit you into thinking that the process is easy.&amp;nbsp; This is a lie.&amp;nbsp; &lt;b&gt;Dieting and exercising suck.&amp;nbsp; This is possibly the most miserable thing you can do to yourself.&lt;/b&gt;&amp;nbsp; You are not going to have fun.&lt;br /&gt;&lt;br /&gt;To that end, if you are more than 50 pounds overweight, are unmarried, have no children, and your only reason to get up in the morning is your shitty software job, the healthy lifestyle is not for you.&amp;nbsp; You are better off eating yourself to the grave: you will get much more satisfaction out of life by eating cheeseburgers than you will by torturing the pounds of fat off your gut.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Simple I/O Operation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The science of weight loss is simple: eat fewer calories than you burn.&amp;nbsp; You have heard this before, I trust.&amp;nbsp; To follow this principle, you need to start quantifying.&amp;nbsp; I use a web service called &lt;a href=&quot;http://www.fitday.com/&quot;&gt;FitDay&lt;/a&gt; to track the calories I eat versus the calories I burn.&lt;br /&gt;&lt;br /&gt;Start by running a 1,000 calorie per day deficit.&amp;nbsp; To lose a pound of fat, you need to burn around 3,500 calories, so you'll lose two pounds in a week.&amp;nbsp; Just to be clear, &lt;b&gt;doing this sucks ass&lt;/b&gt;.&amp;nbsp; However, there are a few easy ways to trim calories here and 
there.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Go Easy On The Drinking&lt;/i&gt;&lt;br /&gt;When I go out on a Friday night to have a few beers, it's not hard for me to consume 800 calories worth of booze.&amp;nbsp; Yes, liquor helps to numb the pain of writing XML parsers all day, but it comes at an expense.&amp;nbsp; To compensate, take up smoking.&amp;nbsp; I smoke more cigars now: it's a good zero-calorie alternative.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Eat One Serving&lt;/i&gt;&lt;br /&gt;There are eight servings in a box of Barilla pasta.&amp;nbsp; I used to eat half a box of pasta in a single sitting, 4 servings worth.&amp;nbsp; Since you're counting your calories anyhow, you'll already be monitoring servings.&amp;nbsp; You will also spend less money this way: since I started counting my calories, I've been spending 50% less per week on food.&amp;nbsp; More money for cigars.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Drink More Coffee&lt;/i&gt;&lt;br /&gt;Caffeine is an appetite suppressant.&amp;nbsp; In large enough quantities, it can be used as an amphetamine.&amp;nbsp; Drink up.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Ice Cream Keeps You Sane&lt;/i&gt;&lt;br /&gt;Low-fat ice cream has around 120 calories per half cup.&amp;nbsp; Fat calories keep you feeling satiated for longer.&amp;nbsp; The Dreyer's brand (sold as Edy's on the east coast) doesn't suck that much.&lt;br /&gt;&lt;br /&gt;After you have been limiting your calorie intake for two weeks, your stomach will shrink enough that it takes significantly less food to satisfy you.&amp;nbsp; So that's step one: stop eating so damn much.&amp;nbsp; Step two is exercise.&amp;nbsp; And yes, it's awful.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;See What Condition My Condition Was In&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The only real &lt;i&gt;benefit&lt;/i&gt; to exercise is being able to hold your nose over people who don't exercise.&amp;nbsp; That's pretty cool if you're looking to take your misery out on your co-workers.&amp;nbsp; Protip: it's better 
for your general well-being to be a prick to your colleagues than to your family.&lt;br /&gt;&lt;br /&gt;You will lose more weight by dieting than by exercise if you are eating 1,000 fewer calories per day than you burn by doing nothing, so use exercise only as a supplement to your calorie loss.&lt;br /&gt;&lt;br /&gt;If you're going to exercise, use an elliptical machine.&amp;nbsp; Treadmills are terrible: they make you run.&amp;nbsp; If you're like me, you have horrific flashbacks of being 10 years old, sucking wind, being the one that got nailed by the cops because your friends were all physically fit and managed to get away.&amp;nbsp; Failure.&lt;br /&gt;&lt;br /&gt;If you're going to pussyfoot around and work out for 30 minutes in your &quot;fatburn&quot; zone three times a week, don't even bother.&amp;nbsp; You're just wasting your time.&amp;nbsp; One hour per day, hard.&amp;nbsp; You should be close to vomiting by the end of that hour.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Well, That's It&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Easy, huh.&amp;nbsp; Stop eating so damn much and get off your fat lazy ass.&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Machine Learning Is Not As Cool As It Sounds</title>
    <link href="http://teddziuba.com/2008/05/machine-learning-is-not-as-coo.html"/>
    <updated>2008-05-14T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/05/machine-learning-is-not-as-coo</id>
    <content type="html">
      I don't like to talk about my job.&amp;nbsp; Don't get me wrong, I like what I do, I just don't like having to explain things to people who are feigning interest.&amp;nbsp; It's a waste of everyone's time.&lt;br /&gt;&lt;br /&gt;When I do have to go into more detail than &quot;I write software&quot;, I sex it up by saying &quot;I write artificial intelligence software for recommendation systems&quot;.&amp;nbsp; Sounds pretty awesome when you say it like that, huh?&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Truthfully, that's like describing a summer job at Burger King as &quot;caloric energy distribution engineer&quot;.&lt;br /&gt;&lt;br /&gt;Yes, one of the things I do is implement machine learning methods for a news recommendation system.&amp;nbsp; The prerequisite amount of pain-in-the-ass, why-did-I-go-to-college-for-this work, though, dominates the cool stuff.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Vector Space Model AI... Sounds Hot&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The idea here is that you turn your data into N-dimensional vectors and let loose a bunch of linear algebra on that shit.&amp;nbsp; In return, you get stuff like classification and clustering.&amp;nbsp; If you want to sound like you know what you're talking about here, you can mention stuff like &lt;i&gt;separating hyperplane&lt;/i&gt;, &lt;i&gt;sigmoid kernel function&lt;/i&gt;, or &lt;i&gt;k-means++&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;I deal mostly in the vector space model.&amp;nbsp; As awesome as all of this sounds, most of the work is a real pain in the balls.&amp;nbsp; Writing a sequential minimal optimization routine for a support vector machine is a good exercise, but it's not useful in practice.&amp;nbsp; Somebody else has already written it for me, and besides, that's not the problem I need to be concentrating on.&lt;br /&gt;&lt;br /&gt;Most of the methods that deal with VSM machine learning are well defined and fairly easy to implement.&amp;nbsp; What remains a mystery, though, is the generation of 
the input.&amp;nbsp; How you translate your data into vectors is &lt;b&gt;the most important problem to solve&lt;/b&gt;.&amp;nbsp; It's also the most boring. After that, you can worry about shaving 3 nanoseconds off of your dotproduct routine.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Then You Need To Deal With The Academics&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Publish or perish.&amp;nbsp; Yeah, that's cute, but in the real world it's profit or perish, and that means getting useful results.&amp;nbsp; Academics love machine learning because it affords them the opportunity to make slight variations to undergraduate level mathematical procedures, quantify the result, and write it up in LaTeX with graphs and shit.&lt;br /&gt;&lt;br /&gt;It's easy to write a paper showing the effects on precision and recall for a perceptron classifier on the Reuters corpus using normalized vs. non-normalized vectors.&amp;nbsp; It's not easy to generate data as clean as the Reuters corpus from a web crawl.&amp;nbsp; Not only is this task hard, it's about as much fun as chemotherapy.&amp;nbsp; As such, there are no useful papers coming out of academia about how to parse HTML.&amp;nbsp; Unfortunately, problems like these are the ones that need solving.&lt;br /&gt;&lt;br /&gt;When I talked earlier about all of the prerequisite bullshit, this is what I meant.&amp;nbsp; You get the most testicular pain when dealing in text content, and the real deep-rooted ball ache comes from web content.&amp;nbsp; We put a ton of effort into our HTML parsing routines, and it has paid off.&lt;br /&gt;&lt;br /&gt;For reference, altering a method that helped with removing boilerplate content from a web page (&lt;i&gt;boring&lt;/i&gt;) had a greater benefit to the accuracy of our classifier than did dimensionality reduction and normalization combined (&lt;i&gt;sexy&lt;/i&gt;).&amp;nbsp; If you're not picking up what I'm putting down here, I'm saying that the really hard and less &lt;i&gt;science-y&lt;/i&gt; improvements made 
the machine learning better than any of the shit you would read about in an ACM journal.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;This Is Going Somewhere I Promise&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This really is just a buttsore blog post, but I'm on a roll now.&lt;br /&gt;&lt;br /&gt;When I am working with the machine learning part of my job, I am rarely working in my development environment.&amp;nbsp; Most of the real stuff is done in Excel.&amp;nbsp; Well, at least it used to be, until I figured out that &lt;a href=&quot;http://www.r-project.org/&quot;&gt;GNU R&lt;/a&gt; is so awesome it makes me want to fuck myself up with a chainsaw.&lt;br /&gt;&lt;br /&gt;When I make a change to the inputs of a machine learning method (support vector machine in this case), I need to verify that the change I just made was actually positive.&amp;nbsp; And since that can't be done with a JUnit test, I have to get all scientific-method on that shit.&amp;nbsp; Remember in college when you snoozed through advanced statistics because it sucked?&amp;nbsp; Yeah, me too.&amp;nbsp; Good thing I kept the book.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
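      If the before and after classifiers are scored on the same labeled test set, one textbook way to check whether a change actually helped is McNemar's test.&amp;nbsp; That choice is an assumption here; the post doesn't say which test was really used.&amp;nbsp; A minimal sketch, with made-up class and method names:&lt;br /&gt;&lt;br /&gt;
      &lt;pre&gt;&lt;code&gt;
```java
// Sketch only: McNemar's test is an assumed choice, and the names are hypothetical.
class McNemar {

    // onlyOldCorrect: test examples the old classifier got right and the new one got wrong.
    // onlyNewCorrect: test examples the new classifier got right and the old one got wrong.
    static double statistic(int onlyOldCorrect, int onlyNewCorrect) {
        int disagreements = onlyOldCorrect + onlyNewCorrect;
        if (disagreements == 0) {
            return 0.0; // the two classifiers never disagree, so there is nothing to test
        }
        double diff = onlyOldCorrect - onlyNewCorrect;
        return (diff * diff) / disagreements;
    }

    // 3.841 is the chi-square critical value for 1 degree of freedom at alpha = 0.05.
    static boolean significantAt05(int onlyOldCorrect, int onlyNewCorrect) {
        return statistic(onlyOldCorrect, onlyNewCorrect) > 3.841;
    }

    public static void main(String[] args) {
        System.out.println(statistic(10, 30));
        System.out.println(significantAt05(10, 30));
    }
}
```
      &lt;/code&gt;&lt;/pre&gt;
      Only the examples where the two classifiers disagree enter the statistic, so a change that flips 30 examples to correct and 10 to wrong reads very differently from one that flips 22 and 18.&lt;br /&gt;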
    </content>
  </entry>

  <entry>
    <title>Weekend Science Project</title>
    <link href="http://teddziuba.com/2008/05/weekend-science-project.html"/>
    <updated>2008-05-10T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/05/weekend-science-project</id>
    <content type="html">
      I built an HD antenna today.&amp;nbsp; Using &lt;a href=&quot;http://www.metacafe.com/watch/762088/coat_hanger_hdtv_antenna_better_than_store_bought_amazing/&quot;&gt;these instructions&lt;/a&gt; and less than $15 worth of materials, I can get a few local channels in HD over the air.&amp;nbsp; Check this sucker out:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;antenna.JPG&quot; src=&quot;/images/antenna.JPG&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;346&quot; width=&quot;461&quot; /&gt;&lt;br /&gt;&lt;div&gt;Ugly as hell, but I can watch Lost in HD without paying Comcast an extra dime.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>JavaFX Native Look and Feel</title>
    <link href="http://teddziuba.com/2008/05/javafx-native-look-and-feel.html"/>
    <updated>2008-05-10T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/05/javafx-native-look-and-feel</id>
    <content type="html">
      I have been toying around with JavaFX, Sun's answer to Adobe AIR and Microsoft Silverlight.&amp;nbsp; Since JavaFX is pretty much an easy way to do Swing, you can get Swing's pluggable look and feel in Java FX programs.&amp;nbsp; Thank Christ, because the Swing UI components look like shit:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;demo-swing.png&quot; src=&quot;/images/demo-swing.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;90&quot; width=&quot;512&quot; /&gt;&lt;br /&gt; &lt;div&gt;Versus the GTK look and feel:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;demo-gtk.png&quot; src=&quot;/images/demo-gtk.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;94&quot; width=&quot;477&quot; /&gt;&lt;br /&gt;This was done by adding the following snippet to my JavaFX code:&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;pre&gt;&lt;code&gt;
      import javax.swing.UIManager;

      UIManager.setLookAndFeel(
      UIManager.getSystemLookAndFeelClassName()); &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;Oh neat, native UI components.
    </content>
  </entry>

  <entry>
    <title>Eclipse Crashes in Ubuntu Hardy Heron</title>
    <link href="http://teddziuba.com/2008/04/eclipse-crashes-in-ubuntu-hard.html"/>
    <updated>2008-04-26T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/04/eclipse-crashes-in-ubuntu-hard</id>
    <content type="html">
      I just upgraded my workstation to Hardy and got a SIGSEGV when starting Eclipse.&amp;nbsp; It appears to be a bug in the Sun JVM that ships with Hardy, and it only happens on AMD64.&lt;br /&gt;&lt;br /&gt;If you're set on using this runtime, one fix is to disable the JIT compiler by launching Eclipse with &lt;code&gt;-Xint&lt;/code&gt;, but that comes with a severe performance penalty.&lt;br /&gt;&lt;br /&gt;The fix I used was to simply downgrade the Hardy JRE (6-06-0ubuntu1) to the Gutsy version (6-03-0ubuntu2).&amp;nbsp; You'll have to edit /etc/apt/sources.list to add a Gutsy repository.&lt;br /&gt;&lt;br /&gt;I'm pretty sure this is the Sun bug: &lt;a href=&quot;http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6614100&quot;&gt;http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6614100&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And this is the Ubuntu bug: &lt;a href=&quot;https://bugs.launchpad.net/ubuntu/+source/eclipse/+bug/174759&quot;&gt;https://bugs.launchpad.net/ubuntu/+source/eclipse/+bug/174759&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>I'm Going To Scale My Foot Up Your Ass</title>
    <link href="http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y.html"/>
    <updated>2008-04-24T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y</id>
    <content type="html">
      &lt;img alt=&quot;1205210029413.jpg&quot; src=&quot;/images/1205210029413.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;166&quot; width=&quot;399&quot; /&gt; &lt;div&gt;Engineers love to talk about scalability.&amp;nbsp; It makes us feel like the bad ass, dick-swingin' motherfuckers that we wish we could be.&lt;br /&gt;&lt;br /&gt;After we talk about scalability with our co-workers (&lt;i&gt;Yeah, Rails doesn't scale!&lt;/i&gt;), we flex our true engineering prowess by writing a post about it on our blog.&amp;nbsp; Once that post hits Reddit, son, everyone will know &lt;i&gt;how hardcore&lt;/i&gt; you really are.&amp;nbsp; Respect.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;People Who Talk Big About Scalability Don't Need To Worry About It&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Fact:&amp;nbsp; every chest-thumping blog post I have seen written about scalability is either about architecture, Memcached, or both.&amp;nbsp; Some asshole who writes shitty code starts pontificating about &lt;i&gt;&quot;scalable architecture&quot;&lt;/i&gt; with data storage, web frontends, whatever-the-fuck.&amp;nbsp; Dude, your app isn't having scalability problems because of the &lt;i&gt;architecture&lt;/i&gt;.&amp;nbsp; It's having scalability problems because you coded a ton of N^2 loops into it and you're too self-important to get peer reviews on your commits.&lt;br /&gt;&lt;br /&gt;And let's not forget the tools who discover Memcached for the first time, install it on a web server, and notice how fast their app runs now.&amp;nbsp; Yeah, welcome to the modern age.&amp;nbsp; Hope you know what a cache expiry policy is.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;If You Haven't Discussed Capacity Planning, You Can't Discuss Scalability&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You don't need to worry about scalability on your Rails-over-Mysql application because nobody is going to use it.&amp;nbsp; Really.&amp;nbsp; Believe me.&amp;nbsp; You're 
going to get, at most, 1,000 people on your app, and maybe 1% of them will be 7-day active.&amp;nbsp; Scalability is not your problem, getting people to give a shit is.&lt;br /&gt;&lt;br /&gt;Unless you know what you need to scale &lt;i&gt;to&lt;/i&gt;, you can't even begin to talk about scalability.&amp;nbsp; How many users do you want your system to handle? A thousand?&amp;nbsp; Hundred thousand? Ten million?&amp;nbsp; Here's a hint: the system you design to handle a quarter million users is going to be different from the system you design to handle ten million users.&lt;br /&gt;&lt;br /&gt;Of course you'll point to the engineer's wet dream: linear scalability.&amp;nbsp; &lt;i&gt;Lulz but when we get more users we just add more machines you are so stupid ted. uncov sucks.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Yeah, great, well it doesn't exist.&amp;nbsp; Oh no, go ahead and try out Amazon SimpleDB and think to yourself that it will scale linearly.&amp;nbsp; Then, when you get enough users that the latency becomes a problem, blame it on &quot;those shitty Amazon datacenters&quot;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Choosing Technology Don't Mean Shit If You Don't Know How To Use It&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The most common butthurt about scalability is this:&amp;nbsp; choose a technology.&amp;nbsp; If you like the technology, claim &lt;i&gt;&quot;technology X scales better!&quot;&lt;/i&gt; If you don't like it, claim &lt;i&gt;&quot;technology X doesn't scale!&quot;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Saying &quot;Rails doesn't scale&quot; is like saying &quot;my car doesn't go infinitely fast&quot;.&amp;nbsp; Alternatively, saying &quot;We'll have no problems scaling because we're using Django&quot; is like saying &quot;I will win every race because my car is the most powerful&quot;.&amp;nbsp; Maybe so, but you suck at driving, and you're up against professionals.&lt;br /&gt;&lt;br /&gt;If you're having scalability problems and blaming it on a single technology, 
chances are, you're doing it wrong.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;tl;dr&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Shut up about scalability, no one is using your app anyway.&lt;br /&gt;&lt;/div&gt;
    </content>
  </entry>

  <entry>
    <title>Don't Serialize Java Objects In Hadoop SequenceFiles</title>
    <link href="http://teddziuba.com/2008/04/dont-serialize-java-object-in.html"/>
    <updated>2008-04-08T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/04/dont-serialize-java-object-in</id>
    <content type="html">
      Not if you can avoid it, at least.&lt;br /&gt;&lt;br /&gt;Hadoop provides you with the &lt;code&gt;Writable&lt;/code&gt; interface if you want to write your object to a &lt;code&gt;SequenceFile&lt;/code&gt;.&amp;nbsp; It's up to you to implement the &lt;code&gt;write()&lt;/code&gt; and &lt;code&gt;readFields()&lt;/code&gt; methods for your object.&amp;nbsp; It's easy if your object is simple: just write each of your instance variables to a &lt;code&gt;DataOutput&lt;/code&gt; and read them back in the same order from a &lt;code&gt;DataInput&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Don't Write Your Object As A Serialized Byte Array&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;I got lazy when I was implementing the Writable interface with one of our classes because it had a ton of instance variables.&amp;nbsp; I figured I'd just serialize it to a byte array, then write the array length and the whole array to the DataOutput.&amp;nbsp; And on the read, well, just unserialize the object from the byte array.&amp;nbsp;&amp;nbsp; This was my &lt;code&gt;write()&lt;/code&gt;:&lt;br /&gt;

      &lt;pre&gt;&lt;code&gt;
      @Override
      public void write(DataOutput out) throws IOException {
      ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
      ObjectOutputStream objectOut = new ObjectOutputStream(byteOutStream);

      objectOut.writeObject(getContainedObject());
      objectOut.close();

      byte[] serializedObject = byteOutStream.toByteArray();

      out.writeInt(serializedObject.length);
      out.write(serializedObject);

      }
      &lt;/code&gt;&lt;/pre&gt;

      Naw, dude.  Bad idea.&lt;br /&gt;&lt;br /&gt;I knew that I'd be paying some overhead in both space and time for this little scheme, but I didn't know how much.&amp;nbsp; It was just a little bit per object, but when we started seeing MapReductions take way too much time in I/O, it was time to revisit this.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What This Cost In Space And Time&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;First, the Java serialization space overhead.&amp;nbsp; On a toy example of this object, serialization to a byte array used 953 bytes.&amp;nbsp; Properly writing out the instance variables consumed 296 bytes.&amp;nbsp; In production, doing it the right way shrank a 1,600-record &lt;code&gt;SequenceFile&lt;/code&gt; from 1.4GB to 825MB.&lt;br /&gt;&lt;br /&gt;Time savings were great, too.&amp;nbsp; In the same toy example, it took my JVM 7.2 milliseconds to serialize the object and 1.7 milliseconds to unserialize.&amp;nbsp; Doing it with stream I/O only took 76,000 nanoseconds to serialize and 58,000 nanoseconds to unserialize.&lt;br /&gt;&lt;br /&gt;I love order-of-magnitude improvements.&lt;br /&gt;&lt;br /&gt;Lesson learned: get off your lazy ass and do it right.&lt;br /&gt;
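      &lt;br /&gt;For a rough feel of where the overhead comes from, here is a sketch that uses only the JDK, with no Hadoop on the classpath.&amp;nbsp; &lt;code&gt;ToyRecord&lt;/code&gt; and its three fields are made up for illustration; its &lt;code&gt;write()&lt;/code&gt; and &lt;code&gt;readFields()&lt;/code&gt; methods copy the shape of the &lt;code&gt;Writable&lt;/code&gt; methods but do not implement the real interface.&lt;br /&gt;&lt;br /&gt;
      &lt;pre&gt;&lt;code&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical record type for illustration; not the class from the post.
class ToyRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    long docId;
    double score;
    String url;

    ToyRecord() {}

    ToyRecord(long docId, double score, String url) {
        this.docId = docId;
        this.score = score;
        this.url = url;
    }

    // Same shape as Writable.write(): one call per instance variable.
    public void write(DataOutput out) throws IOException {
        out.writeLong(docId);
        out.writeDouble(score);
        out.writeUTF(url);
    }

    // Same shape as Writable.readFields(): read back in the same order.
    public void readFields(DataInput in) throws IOException {
        docId = in.readLong();
        score = in.readDouble();
        url = in.readUTF();
    }

    // Bytes produced by writing the fields directly.
    static byte[] fieldBytes(ToyRecord r) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        r.write(out);
        out.flush();
        return bytes.toByteArray();
    }

    // Bytes produced by Java serialization, as in the lazy version.
    static byte[] serializedBytes(ToyRecord r) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(r);
        out.close();
        return bytes.toByteArray();
    }

    // Rebuild a record from the field-by-field encoding.
    static ToyRecord fromFieldBytes(byte[] data) throws IOException {
        ToyRecord r = new ToyRecord();
        r.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        return r;
    }

    public static void main(String[] args) throws IOException {
        ToyRecord r = new ToyRecord(42L, 0.5, "http://teddziuba.com/");
        System.out.println("field-by-field bytes: " + fieldBytes(r).length);
        System.out.println("Java-serialized bytes: " + serializedBytes(r).length);
    }
}
```
      &lt;/code&gt;&lt;/pre&gt;
      The field-by-field form is just the payload: 8 bytes for the long, 8 for the double, and a 2-byte length plus the characters for the string.&amp;nbsp; The serialized form also carries a stream header and a class descriptor with the class and field names, which is where the extra bytes go.&lt;br /&gt;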
    </content>
  </entry>

  <entry>
    <title>Plugged in the New York Times</title>
    <link href="http://teddziuba.com/2008/03/plugged-in-the-new-york-times.html"/>
    <updated>2008-03-20T00:00:00-07:00</updated>
    <id>http://teddziuba.com/2008/03/plugged-in-the-new-york-times</id>
    <content type="html">
      Me, Persai, and Uncov got a &lt;a href=&quot;http://www.nytimes.com/2008/03/20/technology/personaltech/20basics.html&quot;&gt;plug in the New York Times&lt;/a&gt; today.&amp;nbsp; We've been in VentureBeat, Slate, and now NYT, but not TechCrunch.&amp;nbsp; Something tells me that's not an accident.
    </content>
  </entry>

  <entry>
    <title>A Magic Elixir</title>
    <link href="http://teddziuba.com/2008/02/a-magic-elixir.html"/>
    <updated>2008-02-27T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/02/a-magic-elixir</id>
    <content type="html">
      &lt;img alt=&quot;piglet.jpg&quot; src=&quot;/images/piglet.jpg&quot; class=&quot;mt-image-right&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;253&quot; width=&quot;338&quot; /&gt;In software, there are no silver bullets.&amp;nbsp; In internal combustion engine mechanics, however, there are plenty. And I just discovered one.&lt;br /&gt;&lt;br /&gt;It's called &lt;a href=&quot;http://www.amazon.com/Sea-Foam-Marine-Motor-Treatment/dp/B0002ZVMQO&quot;&gt;Sea Foam&lt;/a&gt;, and it will cure what ails 'ya.&lt;br /&gt;&lt;br /&gt;My wife's first motorcycle was a Honda Rebel 250.&amp;nbsp; She upgraded too late in the season and couldn't sell the starter before winter showed up.&amp;nbsp; Winter time is a dead zone for the used motorcycle market in the San Francisco Bay Area, so the Rebel sat in the parking garage for 5 months.&amp;nbsp; Being lazy, I didn't properly store it.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;We went to fire it up yesterday to prepare it for its 15 minutes of Craigslist fame, and it wouldn't turn over.&amp;nbsp; Gasoline, if left long enough, will degrade into a mucky varnish that cakes the inside of your carburetors.&lt;br /&gt;&lt;br /&gt;I poured half a can of Sea Foam into the tank and let it sit for a few minutes.&amp;nbsp; I cranked it again and it made a few pathetic putts.&amp;nbsp; A few more cranks, a few more putts, but after about 5 tries, the Rebel roared to life.&lt;br /&gt;&lt;br /&gt;A six dollar bottle of some petroleum distillate has the same end effect as a three hundred dollar carburetor job.&lt;br /&gt;&lt;br /&gt;I am detecting much win in this sector.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>Core Dumps Disabled By Default In Ubuntu</title>
    <link href="http://teddziuba.com/2008/02/core-dumps-disabled-by-default.html"/>
    <updated>2008-02-19T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/02/core-dumps-disabled-by-default</id>
    <content type="html">
      Enable them with this command:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;ulimit -c unlimited&lt;/code&gt;&lt;br /&gt;
    </content>
  </entry>

  <entry>
    <title>The Road To Hell Is 64 Bits Wide</title>
    <link href="http://teddziuba.com/2008/02/the-road-to-hell-is-64-bits-wi.html"/>
    <updated>2008-02-14T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/02/the-road-to-hell-is-64-bits-wi</id>
    <content type="html">
      Java is an awesome language because you get to ignore hard stuff like memory allocation.&amp;nbsp; Write once, run anywhere.&amp;nbsp; Sweet, where do I sign up?&lt;br /&gt;&lt;br /&gt;The privilege of not having to manage memory comes at a cost: you aren't allowed to question how the JVM works.&amp;nbsp; Move along, coder.&amp;nbsp; Keep making those objects.&amp;nbsp; Don't ask how much memory things take up. In fact, to keep you from getting curious, we're not even going to have a &lt;code&gt;sizeof&lt;/code&gt; function.&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;br /&gt;How My Complacency Made Me Fail&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;When you're coding in Java, it's easy to buy into this mentality.&amp;nbsp; You never really have to worry about how much space anything takes up, and if you get an &lt;code&gt;OutOfMemoryError&lt;/code&gt;, just give the JVM more memory.&amp;nbsp; Problem solved.&lt;br /&gt;&lt;br /&gt;But there are times when you need to be conscious of how the JVM actually works.&amp;nbsp; For example, when you're trying to squeeze every bit of performance out of the crappiest of machines, such as the small Amazon EC2 instances (where Persai is hosted).&lt;br /&gt;&lt;br /&gt;Here's a real-world example of how I got burned:&lt;br /&gt;&lt;br /&gt;Persai does a lot of work with high-dimensionality, sparse vectors.&amp;nbsp; To save space, we compact the vectors.&amp;nbsp; Since most of the values in the vectors are zero, we simply do not store them.&amp;nbsp; What we store amounts to a list of the nonzero element indices and the corresponding values.&amp;nbsp; This is our basic data structure:&lt;br /&gt;&lt;br /&gt;
      &lt;pre&gt;class sparseNode {
      public int index;
      public double value;
      }
      &lt;/pre&gt;
      So a vector is an array of &lt;code&gt;sparseNode&lt;/code&gt; objects. Sounds easy enough, and for a while, it was.&amp;nbsp; That is, until I was tasked with storing as many of these in memory as I could.&lt;br /&gt;&lt;br /&gt;Where's the fail here?&amp;nbsp; How big is an &lt;code&gt;int&lt;/code&gt; primitive in Java? 32 bits, right?&amp;nbsp; Sort of.&amp;nbsp; The Java specification says that an implementation must provide 32 bits of workable space for the programmer using an &lt;code&gt;int&lt;/code&gt;, but makes no mention of how the virtual machine must store this variable.&lt;br /&gt;&lt;br /&gt;In Sun's HotSpot JVM, object storage is aligned to the nearest 64-bit boundary.&amp;nbsp; On top of this, every object has a 2-word header in memory.&amp;nbsp; The JVM's word size is usually the platform's native pointer size.&amp;nbsp; Alright, two words for the object header, one word for the &lt;code&gt;int&lt;/code&gt;, two words for the double.&amp;nbsp; With 32-bit words, that's 5 words: 160 bits.&amp;nbsp; Because of the alignment, this object will occupy 192 bits of memory.&amp;nbsp; Effectively, the &lt;code&gt;int&lt;/code&gt; value is taking 64 bits!&amp;nbsp; In an array of these things, I've wasted N times 32 bits.&amp;nbsp; Figure, a typical vector is about 200 elements long, so that's 800 bytes out the window for each one.&lt;br /&gt;&lt;br /&gt;This fun fact would have been good to know when I was doing my initial back-of-the-envelope calculation of how many vectors I could fit in a gigabyte of memory.&lt;br /&gt;&lt;br /&gt;Yes, I know I should be complaining about the same thing when using C structs.&amp;nbsp; But you know what?&amp;nbsp; When you learn C, you are introduced to many harsh realities.&amp;nbsp; When you learn Java, you are introduced to XML.&amp;nbsp; They protect you from the hard things.&amp;nbsp; Live and learn, I guess.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;The Fix&lt;/b&gt;&lt;/font&gt;&lt;br 
/&gt;&lt;br /&gt;Once I knew this, it was easy to rescue myself.&amp;nbsp; Java primitive arrays are subject to the same alignment issue, but in our case, they can help solve the problem.&amp;nbsp; Instead of representing a vector as an &lt;i&gt;array of objects&lt;/i&gt;, we'll represent a vector as an &lt;i&gt;object of arrays&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;

      &lt;pre&gt;class sparseVector {
      public int[] indices;
      public double[] values;
      }
      &lt;/pre&gt;

      This way, we're going to lose at most 4 bytes per vector with the alignment of the &lt;code&gt;int[]&lt;/code&gt; array.&amp;nbsp; This sure beats the ~800 byte loss with the other solution.&lt;br /&gt;
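      &lt;br /&gt;To make the arithmetic and the layout concrete, here is a minimal sketch.&amp;nbsp; The 2-word header and 64-bit alignment are the figures quoted above, taken as assumptions rather than measured; the &lt;code&gt;dot&lt;/code&gt;, &lt;code&gt;alignTo64&lt;/code&gt; and &lt;code&gt;bitsPerNode&lt;/code&gt; helpers are hypothetical additions that only show how the object-of-arrays form gets used.&lt;br /&gt;&lt;br /&gt;
      &lt;pre&gt;&lt;code&gt;
```java
// Sketch only: the header and alignment numbers are the post's figures,
// treated as assumptions here, not measurements of any particular JVM.
class SparseVector {
    final int[] indices;    // positions of the nonzero elements
    final double[] values;  // values[i] belongs to indices[i]

    SparseVector(int[] indices, double[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values must pair up");
        }
        this.indices = indices;
        this.values = values;
    }

    // Dot product against a dense vector; only the nonzero entries are visited.
    double dot(double[] dense) {
        double sum = 0.0;
        for (int i = 0; i != indices.length; i++) {
            sum += values[i] * dense[indices[i]];
        }
        return sum;
    }

    // Round a size in bits up to the next multiple of 64.
    static long alignTo64(long bits) {
        return ((bits + 63) / 64) * 64;
    }

    // Array-of-objects layout: each node is 2 header words + a 32-bit int + a 64-bit double.
    static long bitsPerNode(int wordBits) {
        return alignTo64(2L * wordBits + 32 + 64);
    }

    public static void main(String[] args) {
        SparseVector v = new SparseVector(new int[] {1, 3}, new double[] {2.0, 4.0});
        System.out.println(v.dot(new double[] {1.0, 10.0, 100.0, 1000.0}));
        System.out.println(bitsPerNode(32));
    }
}
```
      &lt;/code&gt;&lt;/pre&gt;
      With 32-bit words, &lt;code&gt;bitsPerNode&lt;/code&gt; lands on the same 192 bits: 160 bits of real content rounded up to the next 64-bit boundary, so 32 bits of padding per node and 800 bytes across 200 nodes.&lt;br /&gt;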
    </content>
  </entry>

  <entry>
    <title>Why Machine Learning Isn't Getting Any Easier</title>
    <link href="http://teddziuba.com/2008/02/why-machine-learning-isnt-gett.html"/>
    <updated>2008-02-02T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/02/why-machine-learning-isnt-gett</id>
    <content type="html">
      I learned the basics of machine learning in college.&amp;nbsp; Classifiers, clustering, all that jazz.&amp;nbsp; Every undergrad computer science major loves this crap because they can understand it with a passing knowledge of linear algebra.&amp;nbsp; &lt;i&gt;Machine learning&lt;/i&gt;.&amp;nbsp; Sounds pretty sexy.&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;br /&gt;In Practice, It's A Lot Harder Than What You Did In College&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;Up front, let me say that I implemented the vast majority of the machine learning technology behind &lt;a href=&quot;http://www.persai.com/&quot;&gt;Persai&lt;/a&gt;.&amp;nbsp; In the beginning, I thought it was going to be a breeze.&amp;nbsp; Take some documents, turn them into features, train a classifier, and off you go.&amp;nbsp; The harsh reality is that less than one percent of the time I spent on this system went into the &quot;sexy part&quot; of machine learning, and most of &lt;i&gt;that&lt;/i&gt; was done by the guy who wrote the SVM library we use!&lt;br /&gt;&lt;br /&gt;The lion's share of time, and the source of most of the hair-pulling, was spent dealing with the data.&amp;nbsp; Data coming in off of the open internet is &lt;i&gt;dirty&lt;/i&gt;.&amp;nbsp; Conflicting character set declarations, boilerplate removal, duplicate detection: these things will drive you to insanity and back.&lt;br /&gt;&lt;br /&gt;I was fooled by the simplicity when I first learned this stuff.&amp;nbsp; This is what they teach you about vector space model based classifiers:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Turn your data into vectors.&lt;/li&gt;&lt;li&gt;Specify the positive and negative samples.&lt;/li&gt;&lt;li&gt;Train your classifier.&lt;/li&gt;&lt;li&gt;Tune your vectorization scheme and classifier parameters until the classifier is good.&lt;/li&gt;&lt;/ol&gt;What they don't teach you is this: &lt;b&gt;Step 1 is a &lt;i&gt;bitch&lt;/i&gt;&lt;/b&gt;.&lt;br 
/&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;br /&gt;Publish Or Perish&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;If turning data into vectors is such a hard problem, why aren't the academics churning out papers about it?&amp;nbsp; Because it's not sexy.&amp;nbsp; There are no numerical nuances to deciding how to handle a document whose declared character set is ISO-8859-1 but is actually encoded in UTF-8.&amp;nbsp; There's no Turing award coming your way for finding a way to make reasonable text out of horrifically malformed HTML that makes you curse Firefox and Internet Explorer for accepting as renderable.&lt;br /&gt;&lt;br /&gt;When I started Persai, I admitted that somebody else has already done the mathematical programming better than I ever could.&amp;nbsp; I didn't spend years of my life studying numerical analysis, so chances are, if I attempted to write my own SVM library, I would fail.&amp;nbsp; So, in the interest of success and avoiding Not-Invented-Here syndrome, I used somebody else's library.&lt;br /&gt;&lt;br /&gt;People have busted my chops for this, too, as if I am somehow less of an engineer if I use a third-party library.&amp;nbsp; However, one thing has become painfully obvious: &lt;i&gt;the quality of a classifier depends much, much more on your ability to sanitize data than on the algorithm you use&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;
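      The mislabeled character set case gives a taste of what this unglamorous work looks like.&amp;nbsp; The sketch below is one possible heuristic, not Persai's actual code: try a strict UTF-8 decode first, and only fall back to the declared character set if that fails.&amp;nbsp; The class and method names are made up; the decoder calls are the JDK's &lt;code&gt;java.nio.charset&lt;/code&gt; API.&lt;br /&gt;&lt;br /&gt;
      &lt;pre&gt;&lt;code&gt;
```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; one heuristic among many.
class CharsetSniffer {

    // True if the bytes decode as UTF-8 with no malformed sequences.
    static boolean looksLikeUtf8(byte[] body) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Trust a clean strict UTF-8 decode over whatever the page declared.
    static String decodeBody(byte[] body, Charset declared) {
        Charset actual = looksLikeUtf8(body) ? StandardCharsets.UTF_8 : declared;
        return new String(body, actual);
    }

    public static void main(String[] args) {
        byte[] utf8Body = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // Declared as ISO-8859-1, actually UTF-8: the strict decode wins.
        System.out.println(decodeBody(utf8Body, StandardCharsets.ISO_8859_1));
    }
}
```
      &lt;/code&gt;&lt;/pre&gt;
      This is far from airtight: plain ASCII passes the UTF-8 check trivially, and a few ISO-8859-1 byte sequences happen to be valid UTF-8 as well.&amp;nbsp; It is the kind of rule you end up tuning against real crawl data.&lt;br /&gt;&lt;br /&gt;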
    </content>
  </entry>

  <entry>
    <title>Hermetic RPC Unit Testing With Thrift and jMock</title>
    <link href="http://teddziuba.com/2008/02/hermetic-rpc-unit-testing-with.html"/>
    <updated>2008-02-01T00:00:00-08:00</updated>
    <id>http://teddziuba.com/2008/02/hermetic-rpc-unit-testing-with</id>
    <content type="html">
      Unit testing is a pain in the ass.&amp;nbsp; I will admit it, I hate doing it.&amp;nbsp; More often than not, you just write a few obvious JUnit tests that you know will pass and say you're finished.&lt;br /&gt;&lt;br /&gt;Testing code that makes RPC calls is especially discouraging.&amp;nbsp; You'll say &lt;i&gt;&quot;I can't unit test it, it needs to set up an RPC server and that's too complicated for JUnit&quot;&lt;/i&gt;, or, if you're like me, you won't even make up an excuse.&lt;br /&gt;&lt;br /&gt;Of course, this laziness comes back to bite you when the code goes into production, the RPC server throws a one-in-a-million exception, and your entire service bites the dust because you never tested that execution path.&lt;br /&gt;&lt;br /&gt;So, given that you don't like to be woken up at 3AM by sysops when you have been out drinking all night, let's unit test our RPC clients.&amp;nbsp; Let's do it without having to start up an RPC server when the test runs, and it would be nice to be able to have fine-grained control over the RPC methods.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;font style=&quot;font-size: 1em;&quot;&gt;She's Thrifty - She's Just My Type&lt;/font&gt;&lt;br /&gt;&lt;/b&gt;&lt;/font&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 0.8em;&quot;&gt;This is the Thrift RPC definition we will be using for this example:&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;&lt;/font&gt;
      &lt;pre&gt;service MyRPCService {
      i64 getDocidForUrl(1: string url),
      }&lt;/pre&gt;

      Simple.  We'll be looking up a 64-bit integral document identifier for a given URL.  Our client code will make a decision about the state of the document given that identifier.&lt;br /&gt;&lt;br /&gt;This is the class we will be testing:&lt;br /&gt;&lt;br /&gt;

      &lt;pre&gt;public class ProgramToTest {

      // class constants
      private static final int RPC_SERVER_PORT = 3141;
      private static final String RPC_SERVER_HOST = &quot;rpcserver.teddziuba.com&quot;;
      private static final long DOCID_IS_OLD_IF_LESS_THAN = 1000;
      public static enum DocumentStatus { OLD, NEW, UNKNOWN };

      // instance variables
      private MyRPCService.Iface myRpc;
      private TSocket socket;

      public ProgramToTest() {}

      public void init() throws TTransportException {
      socket = new TSocket(RPC_SERVER_HOST, RPC_SERVER_PORT);
      TProtocol protocol = new TBinaryProtocol(socket, true, true);
      myRpc = new MyRPCService.Client(protocol);
      socket.open();
      }

      public DocumentStatus getDocumentStatus(String documentUrl) {
      try {
      long docId = myRpc.getDocidForUrl(documentUrl);
      if (docId &amp;lt; DOCID_IS_OLD_IF_LESS_THAN) {
      return DocumentStatus.OLD;
      }
      return DocumentStatus.NEW;
      } catch (TException e) {
      return DocumentStatus.UNKNOWN;
      }
      }

      public void finished() {
      socket.close();
      }

      }
      &lt;/pre&gt;

      If I were still in CS class in college, I would get dinged for having multiple &lt;code&gt;return&lt;/code&gt; statements, but the best part about being a grown up is that when I want a cookie, I can have a cookie.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

      The &lt;code&gt;getDocumentStatus&lt;/code&gt; method is really the only thing we need to test, as clients of this class will be responsible for dealing with a &lt;code&gt;TTransportException&lt;/code&gt; if the socket initialization fails.  The unfortunate part about testing that method is that it makes an RPC call. &lt;i&gt;Sockets. Exceptions. Icky.&lt;/i&gt;  Even though it's easier to say screw it and go have a beer, remember: &lt;b&gt;&lt;i&gt;you gotta do what you gotta do&lt;/i&gt;&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;Making a Mockery&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.jmock.org/&quot;&gt;JMock&lt;/a&gt; is a clever unit testing library that makes mock objects really easy.&amp;nbsp; If you're new to mock objects, read more about them &lt;a href=&quot;http://www.mockobjects.com/&quot;&gt;here&lt;/a&gt;.&amp;nbsp; The basic idea is that we will make an object that &quot;mocks&quot; the behavior of the RPC server, but without doing any I/O.&amp;nbsp; That way, we have complete control over the operations of the server, and can actually test how the client code handles that one-in-a-million exception.&lt;br /&gt;&lt;br /&gt;We'll be mocking out the &lt;code&gt;MyRPCService.Iface&lt;/code&gt; interface that is autogenerated by Thrift, and defining our own behavior for it.  If you've got some experience with JMock, this should be pretty straightforward, and if not, then you'll catch on quick.  JMock's syntax focuses on making the testing conditions human readable.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;Prepare The Class For Testing&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;

      Since we will be providing the &lt;code&gt;ProgramToTest&lt;/code&gt; class with a mocked version of this interface, we need to add a constructor to the class for testing only:&lt;br /&gt;&lt;br /&gt;

      &lt;pre&gt;public ProgramToTest(MyRPCService.Iface testOnlyIface) {
      this.myRpc = testOnlyIface;
      }
      &lt;/pre&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;

      JUnit.&amp;nbsp; We In It.&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;We'll test the low-hanging fruit first.&amp;nbsp; Using our mock to control the return value of the RPC call, we can make sure the logic works:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

      &lt;pre&gt;// context is a jMock Mockery field on the test class, e.g. new JUnit4Mockery()
      @Test
      public void testHandlesOldDocid() throws TException {
      final MyRPCService.Iface mockedRpc = context.mock(MyRPCService.Iface.class);
      ProgramToTest testObject = new ProgramToTest(mockedRpc);

      final long rpcCallReturnValue = 100L;
      final String testUrl = &quot;http://www.teddziuba.com/&quot;;

      context.checking(new Expectations() {
      {
      one(mockedRpc).getDocidForUrl(with(equal(testUrl)));
      will(returnValue(rpcCallReturnValue));
      }
      });

      assertEquals(ProgramToTest.DocumentStatus.OLD,
      testObject.getDocumentStatus(testUrl));
      }
      &lt;/pre&gt;

      That is pretty cool.&amp;nbsp; Without a whole lot of effort, we've managed to make a unit test for a method that depends on an RPC server.&amp;nbsp; This test does not require any network I/O and runs very quickly.&amp;nbsp; It can be run in a self-contained environment, like an automated test server.&amp;nbsp; I call this kind of test &lt;i&gt;hermetic&lt;/i&gt;, because nothing outside of the test code can affect its outcome.&lt;br /&gt;&lt;br /&gt;We can also use JMock to test what happens when an exception is thrown.&amp;nbsp; If a Thrift RPC server throws an exception somewhere in its handler method and that exception is not caught server-side, it will be thrown up to the client as a &lt;code&gt;TException&lt;/code&gt;.&amp;nbsp; To simulate this, we simply change one line of the test expectations:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

      &lt;pre&gt;@Test
      public void testHandlesException() throws TException {
      final MyRPCService.Iface mockedRpc = context.mock(MyRPCService.Iface.class);
      ProgramToTest testObject = new ProgramToTest(mockedRpc);

      final TException rpcException = new TException(&quot;something awful has happened.&quot;);
      final String testUrl = &quot;http://www.teddziuba.com/&quot;;

      context.checking(new Expectations() {
      {
      one(mockedRpc).getDocidForUrl(with(equal(testUrl)));
      will(throwException(rpcException));
      }
      });

      assertEquals(ProgramToTest.DocumentStatus.UNKNOWN,
      testObject.getDocumentStatus(testUrl));
      }
      &lt;/pre&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;

      Go And Do Likewise&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;JMock is an incredibly useful library.&amp;nbsp; If you're a lazy tester like me, it beats the pants off of subclassing.&amp;nbsp; Now, you have no excuse for leaving RPC calls untested.&lt;br /&gt;
      &lt;br /&gt;
    </content>
  </entry>


</feed>